PREFACE




Since the first edition of this book was published it appears that no
entirely new measuring techniques have been developed, but much
progress has been made in refining test items, in analyzing the skills
being tested, and in applying tests to many groups for many purposes.
The one outstanding application was, of course, in the military
establishments during World War II.

The only completely new chapters are Chapter XI, which deals
with military contributions, and Chapter XV, in which personality
theories are discussed. The other chapters have, however, been
thoroughly revised by including more recent material, better
explanations, and more comprehensive discussion of applications. But
the primary purpose of the text is still to serve as an analytical
introduction to measuring instruments, not as a final appraisal of
their value. The real value of a test usually appears several years
after its publication.

The arrangement of the book is somewhat altered to allow the
instructor or student to specialize to meet his needs. Part I is devoted
to measures of ability, and may be used alone for courses principally
concerned with abilities. It contains a discussion of basic
considerations in measurement, of measures of achievements and
aptitudes, and of the use of elementary statistical procedures. Part II
contains three chapters dealing with elementary statistics. For those
who have previously survived a course in statistics, these chapters
will be a short friendly refresher course. For those without previous
statistical training, it is hoped that these chapters will provide an
interesting new experience. Part III comprises the last ten chapters,
which give an introduction to theories of personality, and present in
detail various evaluations of appreciation, interests, attitudes, and
personal integration.

The complete reproduction of tests has been avoided because of 
space limitations, and also to keep them from being invalidated by 
too broad an acquaintance with them. It is expected that the in- 
structor will use a battery of illustrative material during the course, 
while observing the ethical principles discussed in Chapter I. 

Acknowledgements are made with pleasure to the following, who
have read one or more chapters and made many constructive
suggestions: Roger M. Bellows, Gerald S. Blum, Edward E. Bordin,
Frances Estep, Katharine Br. Greene, Max L. Hutt, Arthur E. Johnson,
E. Lowell Kelly, Doris Klein, Daniel R. Miller, Carl Rush,
George A. Satter, Charles E. Scholl, Jr., V. M. Tye, W. L. Wallace,
and Gertha Williams.

I am indebted to the outstanding leaders in the development of 
mental measurements who have, usually under protest, allowed their 
photographs to be reproduced here. Several persons were too modest,
and certainly others were inadvertently omitted. 

Lastly, I wish to express my sincere thanks to the many authors and 
editors who have given permission to reproduce drawings, tables, 
and charts which represent many hundreds of hours of painstaking 
research. 


Ann Arbor, Michigan
March, 1952 


Edward B. Greene


CHAPTER I


INTRODUCTION 




This chapter answers concisely seven important questions about the 
measurement of human behavior. 

What is behavior?

What is measurement? 

What agencies provide tests? 

What agencies use tests? 

What ethical standards are important?

What are the requirements of a good examiner? 

What are the limitations of measurement?

Other chapters in this book present in greater detail the answers to 
these questions. 

Definitions. The words test, item, measuring instrument, scale,
and inventory are used somewhat interchangeably, but they have
distinct meanings of their own. The word test may refer either to an
examining procedure or to printed questions which are used in an
examination of skill. An item is a prescribed stimulus which usually
yields a unit score. A measuring instrument is a set of items which
have been given a standard set of values called a scale. A scale is a
numerical scheme of reference consisting of points or steps that are
usually equivalent in some respect. An inventory is a list of personal
characteristics used in rating or judging oneself or others. When cast
in the form of questions the inventory becomes a questionnaire.

WHAT IS BEHAVIOR? 

The term behavior refers to any series of acts of an individual
which occur in a particular place during a particular time. The
individual may be an object, a person, or a hypothetical entity, for
example, an electron. The acts may be thought of as purely physical
or as involving mental phenomena. No attempt will be made here
to lay down rigid distinctions between physical and mental
phenomena. In general, physical phenomena refer to forces, movements,
and qualities of chemical elements, or to combinations of elements,
either animate or inanimate. Under mental phenomena are grouped acts
of living organisms which are difficult to describe entirely in
physical terms. These acts include wishes, memories of experiences,
and beliefs, most of which involve symbols, and all of which seem to
be dependent upon the activity of nervous tissue. Mental acts at
present must be studied indirectly, for brain elements are too small
and too easily destroyed to be observed and measured directly. Many
investigators believe that mental and physical acts can be explained
by the same natural laws. Certainly there is no sharp line separating
them. The nature of mental acts and mental organization is discussed
throughout the book, particularly in Chapters II, V, VII, and VIII.

WHAT IS MEASUREMENT? 

Broadly speaking, measurement is any kind of comparison re- 
ported in numerical fashion. All measurements involve two some- 
what independent processes: a comparison and a mathematical proce- 
dure, called scaling, which gives a number, called a score, to the re- 
sults of the comparison. Two types of comparisons are in general 
use: 

Qualitative comparisons. Here, two or more persons or objects are
compared to determine whether they have the same qualities.
Qualitative judgments of a rough sort would be used in the examination
of two persons to find if both can hear or if both can solve arithmetic
problems.

Quantitative comparisons. After qualitative similarities have been 
established, comparisons of amount can be made. Judgments of this
type are illustrated by estimating which of two persons has the more 
acute hearing, or which has more arithmetical ability. 

From these two types of comparisons, convenient units of amount 
can sometimes be specified with great precision. Chapter IV describes 
the most frequently used units of comparison for appraising human
abilities. 

How Are Measuring Instruments Developed? 

Many of the measurement techniques used by psychologists were
partly developed by persons working in physical sciences and then
adapted to the appraisal of activities of persons or animals. This
fact makes it possible for persons well trained in measurement 
techniques to understand one another's methods fairly well, even 
though they work in different fields. All scales of measurement have 
been developed along similar lines. 

A good measuring scale is the result of many years of hard work. 
The procedure is well illustrated by the development of measures 
of the hearing of a boy who is suspected of being somewhat deaf. 
One of the crudest measures is made by observing to see whether the 
boy responds to ordinary sounds by turning his head or by trying to 
see what made the sounds. One cannot definitely establish deafness 
from such observation, however, since the child may, for example, be 
feeble-minded or uncooperative, and therefore may not respond 
normally.

For a more careful diagnosis the youngster may be taken into a 
quiet room and asked to tell whether or not he hears a watch ticking. 
The watch may be held at various distances from one ear while his 
other ear is covered. This method of ascertaining deafness is better 
than the first, but it is neither complete nor accurate. The youngster 
may be deaf to certain tones only, or he may think that he hears the 
watch when he does not. 

A further refinement in the measurement of hearing can be made
with an audiometer, a machine which speaks into one ear at a time.
The child is asked to report the numbers which he hears. The best
audiometers give a wide sample of vowel and consonant sounds at
various intensities. A record may be obtained which shows all of
the child's answers. From these answers a measure of his range of
hearing for pitch, intensity, and vowels and consonants may be
obtained. Because there may be some chance successes, however, the
test may not be entirely accurate and complete. Also, if the child is
very young or mentally retarded, or if he has a speech defect, he may
not be able to report correctly the sounds he hears. Hence, this
measurement of hearing is limited by the subject's ability to report
sounds. A still more precise way of measuring hearing does not require
the reproduction of sounds, but presents two sounds, following which
the child is asked to indicate by a simple movement whether they are
the same or different.
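
The remark about chance successes can be made concrete with a little
arithmetic. The sketch below is an illustration added here, not a
procedure taken from the text. It assumes a same-or-different task in
which a child who hears nothing would still guess correctly half the
time, and it assumes that the trials are independent. The function name
prob_at_least and the ten-trial session are hypothetical.

```python
from math import comb

def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more successes in n independent trials
    when each trial succeeds by pure guessing with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical session: 10 same-or-different trials, guessing is right half the time.
for k in (6, 8, 10):
    print(k, "or more right by guessing:", round(prob_at_least(k, 10), 4))
```

Under these assumptions, 6 or more right out of 10 would arise by
guessing more than a third of the time, whereas 8 or more right would
arise in only about 5 sessions in 100. This is one reason a longer
series of trials gives a more trustworthy appraisal.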

There are still some unknown factors present in this situation. For
instance, we can never be sure that a child has done his best. If the
test situation is somewhat strange and terrifying to him, he may wish
to cooperate, but fail to do so. If the boy is suffering from a head cold
or fatigue he may not be able to do his best. In spite of all precautions,
therefore, one must admit that the best appraisals are only
approximations rather than true measures of actual conditions. This same
admission must be made for all measurement in any science.

From this account of the development of measurement of hearing
it appears that at first there existed only a rough concept of not
being able to hear. After much experimentation this concept was
discarded. In its place three more precise concepts which can be
appraised numerically are now used. These are pitch discrimination,
intensity discrimination, and discrimination of spoken sounds,
which are complex combinations of various pitches, intensities, and
rhythms. The construction of instruments for measuring each kind
of hearing phenomenon requires considerable mechanical skill.
Refinements that insure still greater precision are still being
discovered from time to time. The construction of scales is discussed
in greater detail in Chapters III, IV, and VIII.

In order to standardize a test an extensive program of preparations
is necessary. Such a program includes the following steps:

1. Decide specifically what is to be measured, and how. This step
should give the test a clear objective. 

2. Secure a large number of sample items or defined stimulus
situations in order to guarantee a good coverage of the area to be tested.

3. Try out the items on small but representative groups having
known characteristics. This step furnishes a basis for validating items
and the test as a whole. Validation is the process of finding out what 
value a test has in a particular situation. 

4. Analyze the responses to each item to determine such attributes
as content, relation to other items, relation to criteria of success, 
and difficulty. In this way the best items are selected and the others 
are revised. 

5. Revise items to make them more significant in obtaining the 
objectives. 

6. Cross-validate the items, that is, repeat steps 3, 4, and 5, using 
a new group of persons. This step is necessary to avoid chance or
random errors in the first tryout. 

7. Prepare final revisions. This step yields two or more equivalent 
forms of highly important items, with the best arrangement for 
administration and scoring. 

8. Secure standard results by testing large groups of persons se- 
lected as representative samples. 

These eight steps require careful planning, great determination, 
time, the cooperation of many persons in school and industry, and 
usually a considerable amount of ready money. 
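
Steps 4 and 6 of the program above can be sketched in a few lines of
code. This is a minimal illustration under stated assumptions, not the
procedure of any particular test publisher: items are scored 1 for
right and 0 for wrong, "difficulty" is taken as the proportion of the
group answering correctly, and "relation to criteria of success" is
taken as the ordinary product-moment correlation between the item and
an outside criterion. The function names, the cutoff of 0.30, and all
of the scores are invented for the example.

```python
from statistics import mean, pstdev

def correlation(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either has no spread)."""
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(x), mean(y)
    covariance = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return covariance / (sx * sy)

def analyze_items(responses, criterion):
    """Step 4: for each item return (difficulty, correlation with criterion).

    responses -- one list per person of item scores, 1 = right, 0 = wrong
    criterion -- one outside measure of success per person
    Difficulty is the proportion answering correctly, so a HIGH value
    means an EASY item.
    """
    report = []
    for j in range(len(responses[0])):
        column = [person[j] for person in responses]
        report.append((mean(column), correlation(column, criterion)))
    return report

def items_to_keep(first_report, second_report, min_r=0.30):
    """Step 6: keep only items whose criterion correlation holds up
    in both the first tryout and the cross-validation group."""
    return [j for j, (a, b) in enumerate(zip(first_report, second_report))
            if a[1] >= min_r and b[1] >= min_r]

# Invented scores for illustration only: 6 persons, 3 items in each group.
tryout = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0], [1, 1, 1], [1, 0, 0]]
tryout_criterion = [3, 2, 5, 1, 4, 2]
recheck = [[1, 1, 0], [1, 1, 1], [0, 0, 1], [1, 0, 0], [1, 1, 0], [1, 0, 1]]
recheck_criterion = [4, 5, 1, 2, 4, 2]

first = analyze_items(tryout, tryout_criterion)
second = analyze_items(recheck, recheck_criterion)
print("difficulties:", [round(d, 2) for d, _ in first])
print("kept items:", items_to_keep(first, second))
```

With these invented figures the third item looks promising in the
first tryout but fails in the second group, which is the kind of chance
result that cross-validation is meant to expose. Groups of six persons
are far too small for real work and are used here only to keep the
arithmetic visible.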
Sources of Information about Tests

Since testing is developing rapidly, you may wish to know where 
the most recent information about tests can be found. Publishers 
are usually well informed concerning the use of a test. For technical 
criticisms, read the journals or Buros, Mental Measurements
Yearbooks (Rutgers University, New Brunswick, New Jersey). Two
excellent periodicals containing abstracts are found in many libraries.

Psychological Abstracts, published by the American Psychological
Association (1515 Massachusetts Avenue, Washington 5, D.C.),
includes the sections shown in Illus. 1, many of which refer to measures.

ILLUS. 1. CONTENTS OF PSYCHOLOGICAL ABSTRACTS

GENERAL
Theory & Systems • Methods & Apparatus • New Tests • Statistics •
Reference Works • Organizations • History & Biography • Professional
Problems of Psychology

PHYSIOLOGICAL PSYCHOLOGY
Nervous System

RECEPTIVE AND PERCEPTUAL PROCESSES
Vision • Audition

RESPONSE PROCESSES

COMPLEX PROCESSES AND ORGANIZATIONS
Learning & Memory • Thinking & Imagination • Intelligence • Personality
• Aesthetics

DEVELOPMENTAL PSYCHOLOGY
Childhood & Adolescence • Maturity & Old Age

SOCIAL PSYCHOLOGY
Methods & Measurements • Cultures & Cultural Relations • Social
Institutions • Language & Communication • Social Action

CLINICAL PSYCHOLOGY, GUIDANCE, COUNSELING
Methodology, Techniques • Diagnosis & Evaluation • Treatment Methods
• Child Guidance • Vocational Guidance

BEHAVIOR DEVIATIONS
Mental Deficiency • Behavior Problems • Speech Disorders • Crime &
Delinquency • Psychoses • Psychoneuroses • Psychosomatics • Clinical
Neurology • Sensory Defects

EDUCATIONAL PSYCHOLOGY
School Learning • Interests, Attitudes & Habits • Special Education •
Educational Guidance • Educational Measurement • Education Staff Personnel

PERSONNEL PSYCHOLOGY
Selection & Placement • Labor-Management Relations

INDUSTRIAL AND OTHER APPLICATIONS
Industry • Business & Commerce • Professions

(By permission of the Editor of Psychological Abstracts.)

Child Development Abstracts, published by the Society for Research
in Child Development (National Research Council, 2101 Constitution
Avenue, Washington 25, D.C.), contains the sections shown
in Illus. 2.


ILLUS. 2. CONTENTS OF CHILD DEVELOPMENT ABSTRACTS

I. ABSTRACTS OF ARTICLES

A. Morphology: Anatomy; Embryology; Anthropometry; Somatic Constitution
B. Physiology and Biochemistry: Growth; Endocrines; Hormones; Nutrition;
   Vitamins
C. Clinical Medicine and Pathology: Dentistry; Immunology; Diagnostic
   Tests
D. Psychology: Behavior; Intelligence; Learning; Personality
E. Psychiatry and Mental Hygiene: Crime; Delinquency
F. Public Health and Hygiene: Epidemiology; Morbidity; Mortality
G. Human Biology and Demography: Genetics; Natality and Fertility;
   Population; Race and Sex Differences
H. Education: Class Curriculum; Vocational Guidance
I. Sociology and Economics: Laws; Family; Marriage and Divorce

II. BOOK NOTICES
AUTHOR INDEX

(By permission of the Editor of Child Development Abstracts.)


WHAT AGENCIES PROVIDE TESTS? 

In the past many standard tests have been prepared by government 
agencies, nonprofit organizations, and private individuals. 

Government Agencies 

Four types of government agencies have designed or adapted
standard tests: civil service jurisdictions, military establishments, the
United States Employment Service, and offices of education. The civil 
service groups include federal, state, and municipal agencies, all of 
which provide millions of aptitude or skill and knowledge tests each 
year as part of the qualifying examinations of job applicants. Prac- 
tically every type of job is covered. The military establishments have 
found that aptitude and performance tests aid enormously in the 
most effective selection of men for training or skilled jobs. The 
United States Employment Service has developed specific batteries 
of aptitude tests for certain occupations, a General Aptitude Test 
Battery, and a series of short Oral Trade Tests, to detect bluffers and 
to recognize skilled journeymen. The state departments of education
of New York, Ohio, and Indiana are among the few school systems
that publish tests of academic achievement.

Nonprofit Agencies

Among nonprofit organizations that publish tests the largest is
doubtless the Educational Testing Service, Princeton, New Jersey.
It was granted a charter in 1947 as a nonprofit corporation in the
State of New York. It has no stockholders and is under the complete
control of a distinguished board of trustees who represent many areas
in education. It unites in a single organization the testing activities of
the following three previously independent groups: (a) the American
Council on Education, which sponsors the Psychological Examination
for high school graduates, the Cooperative Test Service for
high school achievement tests, and the National Committee on
Teacher Examination; (b) the College Entrance Examination Board,
which issues annual tests covering high school subjects and also a
scholastic aptitude test; (c) the Carnegie Foundation for the
Advancement of Teaching, which prepares examinations of achievement
and aptitude of college graduates.

The Educational Testing Service coordinates the work of these
three groups and eliminates unnecessary duplications. It also
undertakes basic research, and explores new areas in the field of
testing, using grants from various foundations.

Private Agencies 

Private agencies and individuals still account for the production
of the largest variety of tests on the market. A list of publishers is
given in Appendix I and a list of tests in Appendix II.

In the past the most active test development has been in measuring
intelligence and achievement in school subjects in grade schools
and high schools. College achievement testing is fairly well
developed, and recently industrial testing has made great strides. School
achievement tests are now enlarging their function by measuring
results of education other than subject mastery; for example, good
personal adjustments vocationally, socially, and as a citizen in a
democracy are being measured by this type of test. Measures of
interests and personal adjustments or drives are also now being
developed rapidly.


WHAT AGENCIES USE TESTS? 

Four kinds of agencies (educational, industrial, clinical, and civic)
frequently apply standard measures of behavior.

Educators use tests both for individual diagnosis and promotion, 
and for appraisals of a method of instruction or of an instructor.
The diagnostic use of tests can be of great benefit to the student, if 
his failings are recognized and a remedial course is made available to 
him. The greatest benefits are derived when vocational and educa- 
tional counseling are combined with teaching, for counseling in- 
volves a plan of individual development. In order to make such a 
plan the counselor needs the detailed information yielded by the 
most accurate tests. 

The use of standard tests has in some instances led to undesirable 
results. Teachers have felt driven to prepare their pupils to meet 
certain test requirements rather than to develop in the students a
mastery of skills in a reasonable sequence. Many astute educators
consider that this drive to insure that pupils meet certain
requirements is a serious menace to good teaching. Surveys by means of
tests are of considerable value both to administrators and to teachers 
when the goals of instruction are not sacrificed to coaching or cram- 
ming procedures. 

Industrial agencies have used tests principally in the selection or
promotion of employees. Civil service departments use more tests
than any other agency, and their use has on numerous occasions
increased the effectiveness of an employed group. Clerical workers in
private industry have also been frequently appraised by standard
tests. Large industrial agencies are beginning to use tests for
individual guidance in order to determine whether an applicant is well
fitted for some position for which he did not apply. Tests have also
been used in directing employees toward various courses of training.
In the realm of production, standard tests and questionnaires are
sometimes used to detect the effect of fatigue, monotony, physical and
social working conditions, and payment systems. In the field of
merchandising, standardized questionnaires are widely used to ascertain
the effects of printed advertisements and radio programs on various
groups. Chapters IX and XI give samples of the use of tests in industry
and military services.

Clinical agencies at times deal with persons who are mentally
abnormal in some degree: the feeble-minded, the psychotic, the
epileptic, the emotional deviate, and the delinquent. A few special tests
of moral and neurotic tendencies have been constructed, but the 
majority of tests applied only in the clinical field are designed to aid 
in determining the deeper aspects of personal integration. 

Civic leaders and agencies are becoming aware of the accuracy of 
appraisals of public opinion on controversial issues. Rough straw 
votes of unrepresentative samples of a group of voters have been 
found to be unreliable, but when a few unambiguous questions of
fact or preference are presented, the results are significant. The
development of more agencies for appraising attitudes toward economic,
social, and political issues is going on rapidly. These will have a
marked effect on political actions in a democratic state.

It is interesting to note a marked tendency among the different
types of agencies to use the same types of tests. The school is
interested in vocational success, mental health, and the development of
character and self-government; hence it uses all types of tests.
Industrial agencies find that success in school is one of the best
indicators of success in business. They therefore use many tests of
educational achievement, as well as tests of character and vocational
fitness.
Clinical agencies are interested in restoring a person to normal life 
in school or the community, so they try to evaluate all aspects of a 
person. All of these agencies are becoming convinced that the whole 
person has to be considered in any adequate plan for his social devel- 
opment or continued employment. 

ETHICAL STANDARDS FOR DISTRIBUTION OF TESTS 

No one knows exactly how many mental tests are given each year
in the United States, but several have estimated that approximately
20 million Americans take about 60 million tests. As in the case of
any large-scale enterprise, careless and unethical practices have at
times arisen. The American Psychological Association established a
committee on ethical standards, and a subcommittee on tests through
its chairman, Donald Super, made a report in 1949. The report
contains a long list of unethical incidents, and defines ethical practices.

The following situations taken from the report illustrate the use 
of unethical practices: 

1. A personnel man employed by a medium-sized steel company called
for advice on a testing problem. He had given a battery of well-known tests
to candidates, had scored them, and wanted to be told over the telephone
what he should use as a passing score. He had made no validating studies
and had no idea that they should be made.

2. An executive was greatly perturbed about a series of personality tests
appearing weekly in a magazine with the name of a lecturer in psychology
in a university attached to the test. The office manager cut out the tests
from week to week and administered the tests to his office staff and then
gave back interpretations. This procedure caused a lot of unrest in the
office and the executive told his office manager that no more tests were to
be given in his organization. The office manager claimed the tests were
very good as they were published by a member of the psychology staff of a
near-by university.

3. A scoring service issues tests and sends scores to private individuals,
even though its official policy is not to do so. A number of persons have been
seen who have been hurt by this practice of leaving test interpretation to
untrained individuals.

4. A local firm using psychological tests in consulting work employs no
psychologists. One staff member took a course with an industrial psychologist
teaching in a near-by university and the firm implies that the psychologist is
associated with its operations. The firm seems to have no difficulty getting
tests.

5. In one large company a group of personnel workers were studying
testing at a near-by university. They administered tests they were studying
to employees and counseled them, sometimes even going over the scoring
with them. In doing so they not only failed to make use of local validation
data available in another section of the same department, but interfered
with validation studies and the promotional use of these tests by that section.

6. A widely publicized test developed by the federal government was
released to a commercial publisher for civilian publication. As the test was
of a type outdated even when first developed, and was released because of
the development of an up-to-date type of substitute, the publisher is actively
marketing an inferior product under unusually good auspices.

7. A book on a projective test depicts it as entirely new and validated for
screening: it is actually a revision, and the conclusions concerning valida- 
tion have since been uniformly contradicted by a number of careful studies 
by other investigators. 

8. The manual for a well-known test cites a number of studies showing 
its validity in practical use, but fails to cite equally good studies showing 
unfavorable results. 

9. In another such case the manual reports validity coefficients against
"ratings on vocational courses as high as .84" [italics added] without
describing the groups tested or citing any of the other, implicitly lower,
correlations found.

10. An interest inventory standardized on 12th grade students was
advertised as suitable for use with high school, college, and adult populations.
However, work with another interest inventory has demonstrated
significant changes in certain types of interest in adolescence and early childhood.

11. A book on executive ability gives the impression that the author's "test
of executive ability" is well validated. Investigation showed that the
author actually had no data which could be examined, either in raw or in
analyzed form, the ostensible reason being their confidential nature.


The following five ethical practices and the rules for applying them
are summarized from material in the report by Super:

1. Preparation. Those who prepare tests have the responsibility
of carefully describing their procedures and of securing adequate
norms and evidence of validity. The limitations of a test should be
clearly stated in the manual.

2. Publication. A test, except for experimental purposes, should
not be published before it is carefully prepared and standardized.
Unjustifiable claims are an indication of lack of responsibility. The
publication of parts of standardized tests in popular magazines or
books may invalidate the test.

3. Application. No one should recommend or assume respon- 
sibility for a testing program who is not thoroughly qualified. Ad- 
vertisers or representatives of publishing houses should usually not 
serve as consultants on testing programs, and psychologists who 
recommend the publications of only one company should be viewed 
with suspicion. Those assuming responsibility for a testing program 
should always have continuous firsthand supervision of the program. 

4. Teaching. Persons teaching the administration and interpreta- 
tion of tests should admit only students who have the prerequisite 
training. Test materials should be retained only by graduate stu- 
dents who will use and protect the material properly. For didactic 
purposes the persons to be tested should be given a reasonable satis- 
faction for their contribution. 

5. Release of Scores. Individual test scores should be released only
to those who can make a reasonable interpretation of them. They
should not be released if they are likely to result in discouragement
or social or emotional disturbances.

THE REQUIREMENTS OF A GOOD EXAMINER 

The preceding description of the most frequent uses of tests leads
to the question, What are the necessary qualifications of one who is
competent to administer and interpret test results?

Five qualifications seem necessary. First, a good examiner must
know why and how to build up a clear set of concepts. Securing this
knowledge is often the most difficult part of the training, for
concepts, even the simplest, are difficult to understand and to keep from
becoming ambiguous. This difficulty is particularly troublesome in
the measurement of behavior, for the basic concepts are intangible
and still somewhat controversial. The analysis of the processes
involved in a response is one of the major persistent problems of
psychologists.

Second, a good examiner must be familiar with the best testing
instruments. To measure the power of a gasoline engine, one must have
a well-standardized device for measuring work done in a given period.
Since a person responds in many more ways than does an engine,
mental measurement is considerably more difficult than measuring
a gasoline engine.

Third, a good examiner must know when he has obtained a good 
sample of performance. Thus, in measuring the power of an engine, 
the tester must control his experiment. The engine must be run at 
a standard rate, using standard temperatures and standard fuels and 
lubricants. Only with such controls can its power be compared
accurately with that of other engines using the same standards. In
mental measurement controls are just as necessary but far more
difficult to secure. If one wishes to have a good representative sample
of a person's speed of reading, it is important that he have not only a 
good reading test, but also control of motivation, fatigue, and ac- 
curacy of performance. 

ILLUS. 3. VARIABLES IN A TEST SITUATION

[Diagram not reproduced; one surviving label reads "Distractions in the room."]



In test situations there are usually a fairly large number of factors
present in various unknown amounts that cannot be well controlled.
Illustration 3 gives some indication of a number of variables present
in a test situation. A person being tested reacts to a number of forces,
some of which conflict with others. For instance, his desire to do well
on the test may be in conflict with his desire to take it easy, or to do
something more entertaining. His fear of failure may be strong
enough to set up disturbing thoughts and physiological reactions.
He may be fatigued or suffering from a cold, or a fever, or
indigestion. He may be distracted by objects in the room or by others
with whom he is competing. The examiner may inspire or discourage him.
He may have obsessions or delusions which make it difficult for him
to keep his mind on the task. It is a prime concern of the examiner to
evaluate such forces in the test situation, and in many instances this
evaluation is more important in appraising behavior than are the 
results of the test. In tests of adjustment, interests, or attitudes, the 
examiner's task is to control and vary the situational stresses so that 
fairly typical reaction patterns will emerge. Some have taken into 
account the external forces which impinge on a person under the 
general heading of press or outside stimuli, representing them by 
arrows. The forces which originate within the person are grouped 
under the heading needs or drives. These terms should be defined 
more carefully, however, before they are accepted for use. 

To a large extent we select the stimuli to which we pay attention. 
For instance, you may not notice a radio program in the next room, 
but I may be driven nearly mad by it. This employee or pupil cringes 
and withdraws from me, although I talk and act in a friendly man- 
ner. The examiner may represent an unreasonable, and possibly 
imagined, authority, or the examinee may have stolen something, 
and be fearful of being discovered. The task of a good examiner, then, 
is not only to ascertain and record the facts of behavior, but also to 
determine the causal sequence in which they occur. 

Fourth, a good examiner must have the ability to judge and use 
available norms, which are the scores of representative groups. Un- 
fortunately, most of the available norms are from small or specially 
selected groups. A person who is well trained in measurement will 
know the special derivation of each set of published results and be 
able to make allowances for it. 
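
One common use of norms is to express a raw score as a percentile rank within the norm group. The sketch below is only an illustration of that idea, not a procedure given in this text: the function name `percentile_rank`, the midpoint treatment of tied scores, and the ten norm-group scores are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, norm_scores):
    """Percent of the norm group falling below `score`.

    Persons tied with `score` are counted as half below it
    (an assumed convention; other conventions exist).
    """
    ordered = sorted(norm_scores)
    if not ordered:
        raise ValueError("norm group is empty")
    below = bisect_left(ordered, score)           # persons scoring lower
    ties = bisect_right(ordered, score) - below   # persons with the same score
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical norm group of ten raw scores.
norm_group = [12, 15, 15, 18, 20, 22, 25, 27, 30, 34]
print(percentile_rank(22, norm_group))
```

With so small a norm group a single added case shifts the result by several points, which is one concrete reason for making allowances when norms come from small or specially selected groups.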

Fifth and finally, a good examiner must be able to report and
interpret test findings correctly. If the measurement is part of an
experimental procedure, then the findings must be checked with the
hypothesis; if the measurement is for individual purposes, then the
limits of prediction should be known and specified. Interpretation
is one of the most important and difficult parts of the work. A large 
number of persons who can give tests under standard conditions can- 
not make accurate interpretations. These persons are useful as labo- 
ratory technicians, but they should be supervised by a well-trained 
specialist. 

The usual training for an expert in the field of mental measurement
includes undergraduate emphasis in college upon mathematics,
biology, and sociology; and at least three years of graduate work, specializing
in statistical procedures; experimental, systematic, and abnormal
psychology; and the theory and application of numerical appraisals
of behavior.

LIMITATIONS OF TESTING TECHNIQUES 

Thus far the nature of mental measures and some of their uses 
have been described. It behooves us to become aware of certain of
their limitations that are often overlooked. Three types of limitations
are important.

First, mental measurement techniques cannot be expected to make 
decisions for a person. They can only present the evidence more 
clearly. From careful measurement it may appear that a student’s
chances of average success in a course in electrical engineering, or in
second-year French, or in medicine, are one in one hundred. But the
tests cannot decide for a person whether or not he shall attempt the
course or the profession. Often a person must experience failure in
order to be made to realize his limitations.

Second, the best tests available cannot, at present, predict with
great accuracy what a person will do in complex learning or vocational
situations. This limitation exists because people vary in motives,
emotional balance, social acceptance, health, opportunities,
and in many other ways which are not checked by a series of tests.
Even from a very complete study of a person, predictions of success 
ten years later have seldom been very accurate. To be sure, most of 
the predictions of success in college or industry from a particular test 
or a battery of tests are much better than chance, but still they are of 
limited value for individual counseling. Chapter III discusses how 
accurate various predictions of behavior should be in order to be of 
value to individuals. 

Third, mental tests ordinarily cannot show why a person made a 
particular score, but only that he did make the score. This is true 
of all measures of behavior. For instance, within certain limits, the 
speed of an automobile can be measured in miles per hour, but the 
speed record does not indicate why it goes slowly or rapidly. A large 
number of factors, such as engine and chassis design, road and wind 
friction, fuels, and oils, affect the total speed. The contributory factors
in measures of human behavior are considerably more complex.
Thus, a speed-of-reading test score depends upon such variables as 
visual acuity, fatigue, verbal information, verbal skills, and desire to 
succeed, about which one is often uninformed. As more careful con- 
trols are exercised in testing, the unknown variables will be reduced, 
and the scores will become more precise and valuable.

Study Guide Questions

1. Define succinctly test inventory, measuring instrument, item, rating,
scale, estimate, belief, behavior, validation.

2. What are the main differences between qualitative and quantitative
judgments?

3. How are good measuring instruments developed?

4. What facts should be known about each item and why?

5. Why should a standardized test be cross-validated?

6. What types of agencies produce tests?

7. What advantages and disadvantages are there in having many agencies
competing in the development and distribution of tests?

8. Why do most types of agencies now tend to use the same types of
evaluations?

9. Summarize the ethical principles for test publication and distribution.

10. List the elements of training required for examiners.

11. Why are tests not a panacea for solving many problems of a personal
nature in schools and industries?



CHAPTER II


TYPES OF APPRAISALS 




This chapter gives an over-all view of almost all available types of 
appraisals of human behavior, presents a few examples of appraisals, 
and refers to many which are discussed in other chapters. Also, it 
contains a classification of tests according to purpose and type of 
administration. 


PERSONAL TRAITS 

There are two main approaches to measurement. In one a person
is considered a complex unit, and a general intelligence or general 
adjustment score is sought which will reflect total functioning. In 
the other a person is considered to be made up of many related parts, 
some of which may function with considerable independence. Sepa- 
rate measures are sought for each independent part or for its cor- 
responding pattern of behavior. The nature and degree of relation- 
ship between parts are subjects of considerable research. The second 
approach, being analytical, has been on the whole more fruitful than 
the first, but both have made their contributions. 

The word trait in this text is used to refer to any physical aspect of 
a person, such as height, size of brain, or pulse rate; or to any mental 
aspect, such as speed of reading, attitudes toward war, or ideals about 
home life. Traits usually have three important attributes: intangibil- 
ity, multiple causation, and normal distribution. 

Intangibility. A few traits, for example, height and weight, are
tangible; that is, they can be measured directly. Most others — reading
ability, fears, needs, etc. — are intangible, and can only be inferred
from a series of observations after the behavior which indicates
the trait is carefully reported, classified, and counted. This procedure
is called indirect appraisal, and it always involves the weighting of
items to produce scores.

Multiple causation. All traits are the result of a large number of 
factors. Thus a person's height at any time is the result of an in- 
herited tendency to growth of bones and connecting tissues, which is 
related to a goodly number of genes, and also the result of nutrition, 
exercise, and posture. These contribute to a complex growth curve 
for stature, which is not the same as the growth curve for weight, 
teeth, or mental ability. A person's attitude toward the Chinese will 
depend upon such factors as his age, experience, the color of his own 
skin, his economic status, the attitudes of his family and friends, and 
his independence of thought. 

Normal distribution. Among persons selected at random from a
large population the amount of a trait possessed by each will vary
from a very small to a very large amount, with a large proportion of
the group possessing a moderate amount of the trait. When the results
are presented graphically they most often form a normal or bell-shaped
curve (Illus. 121).
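
The link between multiple causation and the bell-shaped curve can be seen in a small simulation. The sketch below is an illustration only: it assumes, far more simply than any real trait would allow, that each person's score is the sum of twenty independent yes-or-no influences. The function name `simulate_trait`, the number of factors, and the group size are hypothetical choices.

```python
import random
from collections import Counter


def simulate_trait(n_people, n_factors, seed=0):
    """Return one trait score per person.

    Each score is the sum of `n_factors` independent 0/1 influences,
    a deliberately crude stand-in for multiple causation.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.randint(0, 1) for _ in range(n_factors))
        for _ in range(n_people)
    ]


scores = simulate_trait(n_people=2000, n_factors=20)
counts = Counter(scores)

# Crude text histogram: one '#' for every 10 persons at each score value.
for value in range(min(scores), max(scores) + 1):
    print(f"{value:2d} {'#' * (counts[value] // 10)}")
```

Most simulated persons fall near the middle of the range and very few at either extreme, which is the pattern the paragraph above describes.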

All traits, even those that seem to be the least complex, are ap- 
parently related to many others. It is possible, however, and also de- 
sirable, to classify traits according to their similarities and differences. 
A good classification will prevent conjecture and result in much more 
accurate measures. 

It is generally agreed that there are at least five major groups or
categories of traits: physical, bodily reactions, cognitive, motivational,
and integrative. The names of these groups are, however,
not all well agreed upon as yet; hence definitions are in order.

Physical Traits 

Physical traits are those derived from the structure and materials 
of a person's body: the size and shape of limbs, bones, and various 
organs, color and texture of eyes, hair, skin, tissues, and the like. 
They are not described in this text except with reference to the work 
of Sheldon (1940) in Chapter XV. 

Bodily Reactions 

Bodily reactions give rise to four kinds of traits. Physiological 
traits arise from reactions of the involuntary muscles and glands — 
vascular changes, breathing, temperature regulation, electrical con- 
duction, chemical changes, and tensions or pressures. These reactions 
determine basic energy reserves and expenditures. Reaction times
to stimuli are different from physiological changes in that they usually
involve sense organs and voluntary as well as involuntary muscles.
The central nervous system is also essential.

Psychophysical sensitivity is indicated by the speed and accuracy 
of one’s sensory acuity, by arousing specific sense organs with care- 
fully prepared stimuli. These may involve vision, hearing, touch, 
pressure, taste, temperature, smell, and movement. Motor skills, such
as dexterity, athletic skill, endurance, agility, and strength, are still

accuracy of discrimination from direct
comparison of such stimuli as
sound, chemical, temporal, and spatial
patterns. Tests of perception are
illustrated by a number and name-comparison
test (Illus. 8) and by a
form-comparison test (Illus. 147). The
unit of measurement for these tests is 
usually a judgment of same or differ- 
ent, and the raw score is the number of 
correct judgments in a standard series 
that can be made in a given period of 
time, usually from 3 to 10 minutes. 

Another group of tests that are usu- 
ally classed as perceptual are called 
attention-span tests. In these, visual 
stimuli are exposed for a brief period, 
say one tenth of a second, and then 
the subject is asked to reproduce or to 
name them. These tests are not widely 
used because the necessary lighting and 
timing are difficult to control. They can best be given in a labora- 
tory. 

More widely used are the immediate-memory-span tests. These, 
which are also sometimes placed in this perception group, require 
the reproduction or recognition of stimuli just observed. In these
tests stimuli are usually presented one at a time, and the subject is 
asked to reproduce each exactly as given. Auditory spans for digits, 
disconnected words, sentences, and bead chains are used in the 
Stanford-Binet Test, Terman and Merrill (1937). Baker and Leland 
(1935) developed tests of visual spans for letters and small pictures. 
Seashore (1919) standardized a test for immediate memory of short 
tonal phrases. These tests are not dependent solely upon perception 
of pattern; they also involve a short retention span. It seems likely, 
however, that good retention over short periods depends upon clear 
perception of the elements and their relationships. The score for 
each of these span tests is usually the number of elements recorded 
correctly during a standard testing situation. Sometimes the longest
series of items that can be reproduced is taken as the score.
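
The two ways of scoring a span test mentioned above can be sketched as follows. This is an illustration under stated assumptions, not a published scoring rule: the function names `elements_correct` and `longest_span`, the position-by-position comparison, and the four digit series are all hypothetical.

```python
def elements_correct(presented, reproduced):
    """Count elements reproduced in the same position as presented."""
    return sum(1 for p, r in zip(presented, reproduced) if p == r)


def longest_span(trials):
    """Length of the longest series reproduced with no error at all."""
    best = 0
    for presented, reproduced in trials:
        if presented == reproduced:
            best = max(best, len(presented))
    return best


# Hypothetical digit-span record: (series presented, series reproduced).
trials = [
    ("582", "582"),
    ("6439", "6439"),
    ("42731", "42731"),
    ("619473", "614973"),  # two digits transposed
]

total_elements = sum(elements_correct(p, r) for p, r in trials)
print(total_elements)        # score as number of elements correct
print(longest_span(trials))  # score as longest series reproduced
```

The two scores need not rank persons alike: a subject who nearly reproduces a long series gains elements under the first method but nothing under the second.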

Learning. Although learning often takes place during a test, and
many definitions of intelligence include learning as its most important
element, tests designed to measure learning are not common.
This lack of tests of learning is due to the slowness with which learning
usually takes place and the many serious difficulties encountered
in measuring it.

ILLUS. 8. MINNESOTA VOCATIONAL TEST FOR CLERICAL WORKERS

(Arranged by Dorothy M. Andrew under the direction of
Donald G. Paterson and Howard P. Longstaff)

Name ______ Date ______

TEST 1 — Number Comparison          TEST 2 — Name Comparison

Number Right ______                  Number Right ______

Number Wrong ______                  Number Wrong ______

Score = R - W ______                 Score = R - W ______

Percentile Rating ______             Percentile Rating ______

Instructions

On the inside pages there are two tests. One of the tests consists of pairs of names
and the other of pairs of numbers. If the two names or the two numbers of a pair
are exactly the same make a check mark (✓) on the line between them; if they are
different, make no mark on that line. When the examiner says “Stop!” draw a
line under the last pair at which you have looked.

Samples done correctly of pairs of Numbers
79542 ______ 79524

5794367 __✓__ 5794367

Samples done correctly of pairs of Names
John C. Linder ______ John C. Lender

Investors Syndicate __✓__ Investors Syndicate

Now try the samples below.

66273894 ______ 66273984

527384578 ______ 527384578

New York World ______ New York World

Cargill Grain Co. ______ Cargil Grain Co.

This is a test for Speed and Accuracy. Work as fast as you can without making
mistakes.

Do not turn this page until you are told to begin.

(Copyright 1933. Reproduced by permission of The Psychological
Corporation, 522 Fifth Avenue, New York City.)
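
The score line on the form in Illus. 8, Score = R - W, can be worked through in a short sketch. Several details here are assumptions made for illustration, since the form itself does not state them: that a different pair correctly left unmarked counts as right, that only pairs above the examinee's stopping line are scored, and the examinee's check marks themselves. The function name `score_comparison_test` is likewise hypothetical.

```python
def score_comparison_test(pairs, checked, attempted):
    """Return (number right, number wrong, R - W) for one comparison test.

    pairs     -- list of (left, right) strings in printed order
    checked   -- set of 0-based positions where a check mark was made
    attempted -- number of pairs looked at before "Stop!"
    """
    n_right = n_wrong = 0
    for position, (left, right) in enumerate(pairs[:attempted]):
        truly_same = left == right
        marked_same = position in checked
        if truly_same == marked_same:
            n_right += 1   # correct judgment of same or different
        else:
            n_wrong += 1
    return n_right, n_wrong, n_right - n_wrong


# Sample pairs taken from the form; the marks below are hypothetical.
pairs = [
    ("79542", "79524"),
    ("5794367", "5794367"),
    ("John C. Linder", "John C. Lender"),
    ("Investors Syndicate", "Investors Syndicate"),
    ("66273894", "66273984"),
    ("527384578", "527384578"),
    ("New York World", "New York World"),
    ("Cargill Grain Co.", "Cargil Grain Co."),
]
checked = {1, 3, 4, 5}   # position 4 is checked in error
result = score_comparison_test(pairs, checked, attempted=6)
print(result)
```

Subtracting the wrong judgments penalizes an examinee who works fast by checking carelessly, in keeping with the form's instruction to work for both speed and accuracy.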

Tests which are doubtless affected by speed of learning include
the digit-symbol test (Illus. 9). In this test the numbers are to be
translated into the symbols which are printed immediately beneath
the numbers at the top of the test page. Other examples of this type
of test are the indexing and classification tests shown in Illus. 10, and
various tests of code writing. All of these require the learning of
associations. The unit of measurement is the correct placement of
a symbol according to the code. The total score depends upon the
speed and accuracy with which symbols are placed. In laboratory
situations speed of learning with mirror drawing, mazes, and nonsense
syllables has been extensively studied. In school situations
measures of the acquisition of language and mathematical skills
have been given much attention. These studies result in estimates
of speed of learning, but few standardized tests have resulted.

ILLUS. 9. UNITED STATES ARMY BETA TEST 4: DIGIT-SYMBOL TEST

[Test 4 figure not reproduced.]
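
The scoring of a digit-symbol test, in which the unit is one symbol correctly placed according to the code, can be sketched as below. The key, the digit series, and the responses are invented for illustration and are not those of the Army Beta test; the function name `score_digit_symbol` is also hypothetical.

```python
def score_digit_symbol(key, digits, responses):
    """Count symbols placed correctly according to the code.

    `responses` holds only the items reached within the time limit,
    so a slower examinee simply has a shorter list.
    """
    return sum(1 for digit, symbol in zip(digits, responses)
               if key.get(digit) == symbol)


# Hypothetical code printed at the top of the test page.
key = {1: "-", 2: "N", 3: "L", 4: "U"}

digits = [3, 1, 2, 1, 3, 2, 4]            # items printed on the page
responses = ["L", "-", "N", "-", "N"]     # five items reached, one in error

print(score_digit_symbol(key, digits, responses))
```

Because unreached items add nothing, the total reflects speed and accuracy together, as the text notes.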

Reasoning. The outstanding characteristic of reasoning is the
solution of a problem by the production of a new pattern of behavior
out of experiences which were previously arranged in other patterns.
Under reasoning are included tests of inference and of problem solving.

ILLUS. 10. O’ROURKE CLERICAL APTITUDE TEST, JUNIOR GRADE
(CLERICAL PROBLEMS)

Form 1

O’Rourke Series of Aids in Placement and Guidance

Your name ______

Write the last school grade you completed ______

You will have five minutes to study the samples on this page. The tests on the
following pages are like these. Be sure you understand the samples.

Sample I. You are to file alphabetically in the file drawers shown at the right.
Each drawer contains six folders. After each of the names listed below the
third file drawer, you are to write the number of the folder in which that name
should be filed.

[Drawing of file drawers not reproduced.]

The first name, “Appel,” should be filed in the folder for names from Ap to As.
This folder is numbered 3, so 3 is written after the name “Appel.” The second
name, “Carl,” should be filed between Ca and Cg, which is folder No. 7, so 7 is
written after it. “Denby” should be filed in folder “Da-Dh,” so write 10 after
“Denby.” Next write the number of the folder in which “Earl” should be filed.
The number of the correct folder is [illegible].

Name        Folder
1. Appel    3
2. Carl     7
3. Denby    ___
4. Earl     ___

Sample II. Write a C after the name of each man in the list at the right who is
a teacher, is between 25 and 40 years of age, and resides in either Indiana or
Illinois.

Name     Age          Occupation    Residence
Beach    [illegible]  Teacher       Michigan      ___
Mark     [illegible]  Teacher       Indiana       C
Savoy    [illegible]  [illegible]   [illegible]   ___
Bard     31           Teacher       Illinois      ___

Nothing is written after Mr. Beach’s name, as his residence is neither Indiana
nor Illinois. C is written after Mr. Mark’s name, as he is a teacher, is between
25 and 40, and resides in Indiana. You should not write anything after
Mr. Savoy’s name; he is not a teacher. Write C after Mr. Bard’s name, as he is a
teacher, is 31 years old, and is from Illinois.

Sample III. If the name, the address, and the charge are not exactly the same in
the copy as in the original, X is to be written on the line at the right. If the
copy is the same as the original, write S.

[Original and Copy tables of three names, addresses, and charges not legible in
this reproduction; the first line is checked X.]

The first is marked X because the address was copied incorrectly. S is written
after the second, as the name, address, and charge were all copied exactly as in
the original. In the third, a mistake was made in copying the charge, so write X
on the line.

DO NOT TURN THIS PAGE

(By permission of the Psychological Institute, Washington, D.C.)

In inference tests the subject is asked either to detect inconsistencies
or to make inferences of his own. In the simplest form two
objects or statements are presented, and the subject must show their
relationships by comparing each with a third. This form is seen
in a syllogism test (Illus. 11) in which the subject is asked to check
those inferences which can be correctly deduced from preliminary
statements. Other illustrations are reading tests (Illus. 67 and Illus. 68)
in which the subject is asked to interpret a paragraph. A
more complex form of reasoning test is shown in the interpretation
of scientific data (Illus. 12). In this test the subject is asked to check
the inferences which may be correctly drawn from an account of a
situation in which four or five variables are involved. The unit of
measurement in these tests is usually a judgment of true, false, or
undetermined relationship.

ILLUS. 11. SYLLOGISM TEST

Directions: In each section below, read the first sentence and the line marked a.
If what a says follows from the first sentence, make a circle around the T in line a.
If what a says is false according to the first sentence, circle the F. If the statement
in a need not follow from the first sentence, circle the Q. Do the same for lines b
and c. The sample below is marked correctly.

Sample: All good dancers dance frequently; the men in this house do not dance
frequently; therefore, —

a. The men in this house are not good dancers.    (T)  F   Q
b. All frequent dancers are good dancers.          T   F  (Q)
c. Some good dancers are in this house.            T  (F)  Q

Form A

1. All the people living on this farm are related to the Joneses; these old men
live on this farm; therefore, —

a. These old men are related to the Joneses.                      T F Q
b. All the people related to the Joneses are these old men.       T F Q
c. Some people related to the Joneses are not these old men.      T F Q

(After Wilkins, 1928)

Other fairly common forms of tests which involve reasoning are
analogies, page 231, and disarranged sentences, Illus. 69. In both of
these, isolated words or figures are to be matched or arranged according
to relationships indicated in a context.

In problem-solving tests a person must become aware of a problem,
select a hypothesis or plan for solution, and then apply this

ILLUS. 12. COOPERATIVE CHEMISTRY TEST


A. Selection of Facts

Directions: Following are a number of incomplete statements each of which may
be completed by one or more of the words or phrases given below the statement.
Place a plus sign (+) in the parentheses after those words or phrases which will
make the statement true, as in the following sample.

Sample. Oxygen is an element which

a. Acts chemically as a metal ( ) a.
b. Unites with hydrogen forming water ( + ) b.
c. Is a good conductor of electricity ( ) c.
d. Is rarely found in nature ( ) d.
e. Supports combustion ( + ) e.


B. Terminology

Directions: Below is a numbered list of chemical terms arranged in alphabetical
order. Following the list are several definitions or descriptions of terms used in
Chemistry. Read each definition or description, decide what term it is, then place
the number of the term in the parentheses after the definition or description.

1. Acid salt
2. Aliphatic compounds
3. Amalgam
4. Anhydrous
5. Aromatic compounds
6. Atom
7. Atomic number
8. Basic salt
9. Carboxyl
10. Critical temperature
11. Esters
12. Heat of solution
13. Hydrogenation
14. Inversion
15. Invertase
16. Ketone
17. Kindling temperature
18. Metalloids
19. Molar solution
20. Mole
21. Molecule
22. Monel metal
23. Normal salt
24. Osmosis
25. Saturated solution
26. Solute
27. Solvent
28. Spontaneous combustion
29. Standard solution
30. Zymase

a. Elements which possess in some degree the physical properties of metals
and the chemical properties of nonmetals ( ) a.
b. Characterized by one or more six carbon atom rings ( ) b.
c. A compound composed of a negative ion other than hydroxyl or oxygen
in combination with some positive ion other than hydrogen ( ) c.
d. Characterized by an open chain of carbon atoms ( ) d.
e. A mixture of mercury and one or more other metals ( ) e.
f. Process by which sucrose is changed into a mixture of equal parts of glucose
and fructose ( ) f.
g. Characteristic of substances which are not combined with water ( ) g.
h. A group of elements which characterize the organic acids ( ) h.
i. Burning produced by slow oxidation and the accumulation of heat ( ) i.
j. Smallest unit of a substance which takes part in ordinary chemical
changes ( ) j.
k. The temperature at which a substance begins to glow or bursts into a
flame ( ) k.

ILLUS. 12. COOPERATIVE CHEMISTRY TEST (Cont'd)

C. Application of Principles

Directions: In each of the following exercises a problem is given. Below each problem
are two lists of statements. The first list contains statements which can be
used to answer the problem. Place a plus sign (+) in the parentheses after the
statement or statements which answer the problem. The second list contains
statements which can be used to explain the right answers. Place a plus sign (+)
in the parentheses after the statement or statements which give the reasons for
the right answers. Some of the other statements are true but do not explain the
right answers; do not check these. In doing these exercises then, you are to place
a plus sign (+) in the parentheses after the statements which answer the problem
and which give the reasons for the RIGHT answers.

Sample: Coal gas which has not been previously mixed with air is burned at a gas
jet. At another similar gas jet the coal gas is mixed with air before it is burned.
Will there be any difference in the amount of light given off by the flames of the
two gas jets? Why? If a cool aluminum pan is placed over each flame will there
be any difference in the amount of soot deposited on the pan in the two cases?
Why?

The flame at the first gas jet will give off:

a. More light than the flame at the second gas jet ( + ) a.
b. The same amount of light as the flame at the second gas jet ( ) b.
c. Less light than the flame at the second jet ( ) c.

The soot deposited by the first gas jet will be:

d. More than that deposited by the second gas jet ( + ) d.
e. Less than that deposited by the second gas jet ( ) e.

Check the following statements which give the reason for the answer or answers
you checked above.

f. Incomplete combustion leaves some uncombined carbon in the flame ( ) f.
g. The presence of nitrogen retards combustion ( ) g.
h. Particles of uncombined carbon glow when heated ( + ) h.
i. Combustion is more complete in the first flame ( ) i.
j. The amount of air mixed with the gas does not affect the amount of
light produced by the burning gas ( ) j.
k. Some uncombined carbon in a flame is deposited on a cool surface placed
in a flame ( + ) k.

In the above exercise, the statements which answer the problem are a and d. These
statements are checked because they tell what would be likely to happen. Statements
h and k have also been checked because they are reasons which help to
explain why a and d would happen. You will notice that statement g is a true statement
but it does not give a reason for any of the right answers; hence it is not
checked. The other statements are not true and are not checked. Do each exercise
in this way.

(Samples from Hendricks et al., 1934. By permission of the Cooperative
Test Service, Inc.)

plan in order to find if it is the correct one. If it solves the problem,
he scores a point; if it does not, he must discard it and seek another
solution. Illustrations of this type of test are seen in (a) mathematical
problems (Illus. 70 and Illus. 73), (b) the assembly of apparatus
(Illus. 13), (c) pencil mazes (Illus. 54) in which a person can see the
whole maze at once and must try to evolve a plan for finding a way
through it, and (d) many verbal situations in which one must try
out several hypotheses. The unit of measurement is a problem or a
part of the problem to be solved. Sometimes the method of solution
and the time used are also noted.

ILLUS. 13. MINNESOTA MECHANICAL ASSEMBLY TEST MATERIALS

Top, Box A; Center, Box B; Bottom, Box C
(Paterson et al., 1930. By permission of the Marietta Apparatus Co.)

Reasoning is common in ordinary life, but reasoning ability is
very difficult to measure on a quantitative basis, for different persons
have different amounts of information to aid them in the solution
of a problem. Some persons may know the solution before the problem
is presented. In order to evaluate reasoning activities it is necessary
to eliminate differences in relevant information. This elimination
may be accomplished with fair success in verbal tests by limiting
the vocabulary to words familiar to all in the group being tested.

Knowledge. Knowledge tests require recall or recognition of
verbal or other materials. In many ways these tests are the most 
satisfactory of those developed, because knowledge is easily tested 
and scored and is an important predictor of success in many fields. 
The most structured are the recognition tests, in which the examinee 
chooses the most appropriate answer from two or more presented. 
This type is shown in Illus. 12, Sec. A, in which statements are to be
marked true or not marked at all, and by Illus. 12, Sec. B, in which
one of the thirty answers is to be chosen. All of these items could also
be cast into a recall type of test, in which the examinee is to supply
the answer. Illustration 147 is a completion test in which a particular
word must be recalled to complete a sentence.

ILLUS. 14. LATHE ITEMS FROM THE DETROIT MECHANICAL
APTITUDES TEST FOR BOYS

[Numbered drawing of a lathe not reproduced.]

21 CARRIES BELT ( )

23 ADJUSTS HEIGHT OF REST ( )

24 SUPPORTS TOOL ( )

25 OILS BEARING ( )

26 FASTENS TAIL STOCK CENTER ( )

27 REVOLVES WORK ( )

28 ADJUSTS TAIL STOCK CENTER ( )

29 HOLDS WORK ( )

30 FASTENS TAIL STOCK ( )

31 HOLDS DRILL ( )

32 HOLDS EMERY WHEEL ( )

(Baker and Crockett, 1928. By permission of the authors and the Public
School Publishing Co.)

Knowledge tests are not limited to words; many are in good pictorial
forms. Picture-naming tests are widely used for appraisal of
preschool and mechanical vocabularies. When knowledge of the
use or the motion of parts of a machine is to be tested, picture tests
are often superior to word tests. Illustration 14 presents a lathe, the
parts of which are numbered. The subject is asked to indicate which
parts perform certain functions. In almost all knowledge tests the
unit of measurement is an item correctly answered, and the raw
score is the total number of correct answers. Sometimes corrections
for chance successes are applied as described in Chapter IV. Information
items form the chief component of many achievement, intelligence,
and aptitude tests.
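
A correction for chance successes can be illustrated with the conventional formula in which a fraction of the wrong answers is subtracted from the right ones. This sketch assumes that formula, right minus wrong divided by one less than the number of choices per item; whether Chapter IV uses exactly this form is not shown in the present section. The function name `corrected_score` and the figures are hypothetical.

```python
def corrected_score(right, wrong, choices):
    """Raw score corrected for guessing: R - W / (k - 1).

    right   -- number of items answered correctly
    wrong   -- number answered incorrectly (omitted items are not counted)
    choices -- number of answers offered per item (k)
    """
    if choices < 2:
        raise ValueError("an item must offer at least two choices")
    return right - wrong / (choices - 1)


# Five-choice recognition test: 40 right, 12 wrong.
print(corrected_score(40, 12, choices=5))

# Two-choice test (for example, same or different): reduces to R - W.
print(corrected_score(30, 6, choices=2))
```

With two choices per item the formula reduces to right minus wrong, the same form as the score line printed on the clerical test in Illus. 8.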

Motivation (Goal Seeking) 

Motivation includes all goal-seeking activities, which are usually 
grouped under needs, interests, and attitudes, or sentiments. No 
sharp lines are to be drawn between these three terms, but needs are 
often related to basic physiological processes, interests to specific personal
goals, and attitudes or sentiments to broad generalized ideals
for society. Principal needs — those essential to survival — are for food,
air, light, warmth, sex, exercise, avoidance of pain, and sleep. These
are always determined to some extent by inheritance, but a particular
seeking activity also depends somewhat on the environment. Interests
and attitudes seem to depend mainly upon cultural ideals:
social, political, religious, vocational, artistic, and recreational. 
Strictly speaking there are no good measures of goal-seeking activities, 
but appraisals are made by time sampling, questionnaires, ratings, 
logs, interviews, case histories, and projective techniques. 

For the evaluation of typical attitudes and interests, a large num- 
ber of rating scales or inventories have been developed, some of 
which use self-ratings and some the ratings of others Self-ratings are 
considered to be among the best ways of evaluating interests and at- 
titudes of adults Illustration 15, in which attitudes toward the church 
are to be appraised, and Ulus. 16, in which a person expresses an 
artistic preference among designs, are self-rating tests The unit of 
measurement is based upon a judgment of like, indifference, or 
dislike for a particular activity, or upon a choice between items. Raw 
scores show total likes or dislikes and also particular regions of inter- 
est. 
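
A minimal sketch of how such raw scores might be tallied is given here. The activity names, the region labels, and the function `tally_preferences` are hypothetical; they only illustrate counting likes and dislikes in total and by region of interest, and do not reproduce the scoring of any published inventory.

```python
from collections import Counter, defaultdict

def tally_preferences(responses, region_of):
    """Count like / indifferent / dislike judgments overall and by region.

    responses: {activity: "like" | "indifferent" | "dislike"}
    region_of: {activity: region of interest}  (hypothetical grouping)
    """
    totals = Counter(responses.values())
    by_region = defaultdict(Counter)
    for activity, judgment in responses.items():
        by_region[region_of[activity]][judgment] += 1
    return totals, by_region

# Hypothetical answers from one person on a six-item inventory.
responses = {
    "hiking": "like", "fishing": "like", "camping": "indifferent",
    "bookkeeping": "dislike", "filing": "dislike", "typing": "like",
}
region_of = {
    "hiking": "outdoor", "fishing": "outdoor", "camping": "outdoor",
    "bookkeeping": "clerical", "filing": "clerical", "typing": "clerical",
}
totals, by_region = tally_preferences(responses, region_of)
```

The overall totals give the raw score, while the per-region counts show where the likes and dislikes are concentrated.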

Self-ratings are often considered to be invalid, owing to the fact 
that it is difficult for a person to evaluate his feelings accurately, and
to the fact that he may purposely falsify his report in order to bring
the results more into line with what is socially acceptable. Ratings
are also affected by ambiguities in definitions, as when persons are 
asked to evaluate rather complex traits, such as tact or reasoning 
ability. Lastly, ratings are sometimes subject to halo effects, which 
occur when one general classification, such as attractive or unattrac- 
tive, influences the rater’s judgment on a large number of other traits 
of a more specific nature, for example, neatness, promptness, kind- 
ness, and intelligence. The refinement of ratings is discussed in 
Chapter XVI.

ILLUS. 15. ATTITUDE TOWARD THE CHURCH

Check every statement with which you fully agree.

1. I think the church is a divine institution, and it commands my highest loyalty
and respect.

3. I feel the good done by the church is not worth the money and energy spent
on it.

5. I believe that the church is losing ground as education advances.

7. The teaching of the church is altogether too superficial to be of interest to me.

9. I think the church has a most important influence in the development of moral
habits and attitudes.

11. I regard the church as a harmful institution, breeding narrow-mindedness,
fanaticism, intolerance.

13. I believe in the ideals of my church, but I am tired of its denominationalism.

15. I'm not much against the church, but when I cannot agree with its leaders I
stay away.

17. I believe that the church practices the Golden Rule fairly well and has a
consequent good influence.

19. I feel the church is ridiculous, for it cannot give examples of what it preaches.

21. My church is the primary guiding influence in my life.

23. My attitude toward the church is one of neglect due to lack of interest.

25. I am sympathetic toward the church, but I am not active in its work.

27. I know too little about any church to express an opinion.

29. I am slightly prejudiced against the church and attend only on special
occasions.

31. There is much wrong in my church, but I feel it is so important that it is my
duty to help improve it.

33. I think the church is unreservedly stupid and futile.

35. I feel the church is petty, easily disturbed by matters of little importance.

37. I believe the church is non-scientific, depending for its influence upon fear of
God and hell.

39. It seems absurd to me for a thinking man to be interested in the church.

41. I believe that anyone who will work in a modern church will appreciate its
indispensable value.

43. My attitude toward the church is passive, with a slight tendency to disfavor it.

45. I have nothing but contempt for the church.

(Abbreviated from Thurstone and Chave, 1929, p. 23. By permission of
The University of Chicago Press.)


Integrative Traits 

Integrative traits are ways in which energy is directed in working 
or playing, or in meeting problems or conflicts. One person may run 
away from a serious automobile accident, another may ignore it and 
continue what he is doing, no matter how foolish; another may 
imagine the situation is different from what it really is, another may 
explode into aimless screaming, while still another may take sensible 
steps to give first aid. All except the last of these ways of behaving,
when carried to extremes, are typical of various abnormalities or 
insanities, and indicate poor integration.

A normal person must somehow reconcile various conflicting drives
and use his skills to advance toward desirable goals. This process 
of controlling or channeling drives is best measured by observations 
and by projective tests. In projective tests the stimulus is purposely 
left vague, and the directions encourage one to indicate by word or 
movement whatever the stimulus suggests. One often shows his own
deeper or unconscious wishes or fears, and the degree of inner con- 
flict by associations and fantasies. Thus, in the Kent-Rosanoff Free 
Association Test (Illus. 17) a person is asked to respond to a stimulus
word with the first word

ILLUS. 16. SAMPLE OF THE ARTISTIC PREFERENCE TEST



aspect of behavior. Thus objective-type test scores are definitely
influenced by one’s attitude toward the tests and his emotional
balance. Self-ratings are decidedly influenced by ability to discriminate
accurately, and projective techniques always yield evidence of
thinking efficiency as well as of motives and of integration. Tests are
classified according to the activities which seem most often represented by
the scores.

ILLUS. 17. THE KENT-ROSANOFF FREE ASSOCIATION WORDS

 1. Table        26. Wish         51. Stem         76. Bitter
 2. Dark         27. River        52. Lamp         77. Hammer
 3. Music        28. White        53. Dream        78. Thirsty
 4. Sickness     29. Beautiful    54. Yellow       79. City
 5. Man          30. Window       55. Bread        80. Square
 6. Deep         31. Rough        56. Justice      81. Butter
 7. Soft         32. Citizen      57. Boy          82. Doctor
 8. Eating       33. Foot         58. Light        83. Loud
 9. Mountain     34. Spider       59. Health       84. Thief
10. House        35. Needle       60. Bible        85. Lion
11. Black        36. Red          61. Memory       86. Joy
12. Mutton       37. Sleep        62. Sheep        87. Bed
13. Comfort      38. Anger        63. Bath         88. Heavy
14. Hand         39. Carpet       64. Cottage      89. Tobacco
15. Short        40. Girl         65. Swift        90. Baby
16. Fruit        41. High         66. Blue         91. Moon
17. Butterfly    42. Working      67. Hungry       92. Scissors
18. Smooth       43. Sour         68. Priest       93. Quiet
19. Command      44. Earth        69. Ocean        94. Green
20. Chair        45. Trouble      70. Head         95. Salt
21. Sweet        46. Soldier      71. Stove        96. Street
22. Whistle      47. Cabbage      72. Long         97. King
23. Woman        48. Hard         73. Religion     98. Cheese
24. Cold         49. Eagle        74. Whiskey      99. Blossom
25. Slow         50. Stomach      75. Child       100. Afraid

(Rosanoff, 1920. By permission of John Wiley and Sons, Inc., New York.)

INCLUSIVE TECHNIQUES

Other techniques which yield valuable estimates of many aspects 
of a person are time sampling, logs, interviews, and case histories. 

Time Sampling 

Observations of uncontrolled situations are generally made by 
what is known as the time-sampling method. Observers take samples
of performance at regular intervals during a day or over longer pe- 
riods of time. The results show the frequency with which types of 
behavior patterns appear in certain situations. One disadvantage of 
this method is the difficulty of getting observers to agree upon re- 
ports of what they have seen. Even when the observers have had con- 
siderable training, marked discrepancies sometimes appear in their 
reports. When the reports show that the observers do agree, it may 
be that in some instances they have eliminated the controversial 
data upon which they could not agree. Such reports are not complete 
and may fail to show true relationships. This same difficulty is 
presented in all kinds of observation, but it seems to be more serious 
in time sampling, with the preconceived ideas of the observer serving
to direct the attention toward certain types of behavior. In spite of 
these difficulties, Jersild and Meigs (1939) have summarized studies 
where the results are consistent and fairly complete. The unit of 
measurement is usually the noticed type of behavior, recorded as 
operating at one of the regular periods of sampling. The raw score 
is the number and duration of the periods of the activities under con- 
sideration (Chapter XXIV). 
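
The tallying described above can be sketched in a few lines. This is only an illustration: the behavior labels, the five-minute interval, the function name `tally_time_samples`, and the estimate of duration as the number of periods multiplied by the interval are all assumptions made for the example, not details of any published time-sampling procedure.

```python
from collections import Counter

def tally_time_samples(samples, interval_minutes):
    """Count the sampling periods in which each behavior was noted.

    samples: one behavior label per regular sampling period
             (None when nothing relevant was noticed).
    Returns {behavior: (number_of_periods, estimated_minutes)}.
    """
    counts = Counter(label for label in samples if label is not None)
    return {b: (n, n * interval_minutes) for b, n in counts.items()}

# Hypothetical record: one observation every 5 minutes for 40 minutes.
record = ["play", "play", None, "quarrel", "play", "quarrel", None, "play"]
scores = tally_time_samples(record, interval_minutes=5)
```

Each entry in `scores` pairs the number of periods in which a behavior was noted with a rough estimate of its duration.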

Logs 

Another method of recording behavior is by use of a log. This dif- 
fers from direct observation in that the observer records pertinent 
outstanding events over a considerable period, for example, an hour 
or a whole day. In many cases outstanding events are significant of
behavior, and for certain purposes logs are valuable. Teachers' and 
camp counselors' logs are often used to check results of tests and rat- 
ings. 

One disadvantage of this method is that it is dependent upon emo- 
tional and subjective variables in the observer. If the observer is 
feeling particularly well and happy, he will probably not record as 
many annoying episodes as when he is suffering from a severe head- 
ache. Unless the log is recorded regularly and with considerable 
attention given to the methods of recording, it is extremely difficult
to summarize. In many instances logs have proved to be so vague as
to be useless for comparing persons or groups. 

Interviews 

Because interviewing represents an exceedingly adaptable method 
of securing data, it is more useful than some of the other methods. 
It allows the interviewer to ask questions in such a way as to secure 
the confidence of the person who is being interviewed. Confidence 
is difficult to obtain by means of written or oral tests when individuals 
are on guard and reluctant to show exactly what they can do or what 
they are thinking. With a well-defined outline for interviewing,
many aspects of behavior can be systematically covered and a fair 
appraisal made in a short time. 

Interviews, however, may, in addition to appraising a person, give
information and suggestions. It is often difficult to avoid suggesting, 
by leading questions, the answers which one wishes to get. Interviews 
have another disadvantage — that of being relatively unstandardized. 
The results are difficult to handle numerically unless a standard rat- 
ing technique is used. Interviews are used for nearly all types of 
employment, counseling, clinical, and psychoanalytical appraisals. 

Case Histories and Biographies 

Case histories and biographies are among the most useful kinds 
of records used in evaluating development over a long period. A 
careful case history will include: a record of family history, physical 
development, health, tests of progress in intellectual pursuits, social 
and economic adjustments, and emotional patterns of development. 
It will try to show how much these aspects are dependent upon
environment and how they are related to one another. It is on the basis
of careful case histories and tests that the most accurate predictions 
can be made. 

One disadvantage of the case-history method is the great difficulty
of getting a true record of events which occurred several years
previously. Case histories usually include rather vague memories of a
person, his friends, parents, and teachers, which have been influenced
by the forgetting and improvising processes. Case histories of the
same person made up by two investigators sometimes differ in impor- 
tant respects. The best case histories are those which have been started 
at an early age and thereafter continued by additions at regular
periods.

TYPES OF TESTS

Formal and Informal Tests 

There are three main differences between an informal test and a
formal or standardized test. (1) The formal test has more rigidly 
standardized directions for administration and scoring which come 
from the revision of many preliminary trials. (2) The content of
a formal test has been more thoroughly scrutinized to include only 
important facts and skills, and to eliminate ambiguities and chance 
factors. (3) The formal test has usually been more widely applied, so 
that norms are available for many persons in various age, grade, or 
occupational groups. 

While these differences are all in favor of the formal test, still an 
informal test is often preferred because it can be constructed to fit 
the needs of a particular situation more aptly. Such a situation oc- 
curs so frequently in industry and in Progressive schools that em- 
ployment specialists and teachers in these organizations should be 
prepared to construct their own tests. Chapter IV deals with the 
construction of test items. 

Achievement, Aptitude, and Psychological Tests 

Mental tests may be classified into three groups according to their 
chief uses: (a) to measure present ability (achievement tests); (b) to
predict success (aptitude tests); (c) to diagnose behavior (tests of
psychological processes).

Achievement Tests. Achievement tests are designed to measure
skills and information which have been learned, either in particular 
courses of training or from experience elsewhere. There are widely 
used tests for nearly all school subjects at various grade levels (Chap- 
ter VII). Test items for achievement tests are generally selected from 
those which have been used by teachers as indications of success in 
a course of instruction. For example, if a large group of competent 
teachers agree that the data for a test item have been taught in a 
course on American history, then that item is considered relevant for 
a test of this subject. The first criterion of exclusion or inclusion, then,
is the agreement of competent judges on its relevancy. 

A group of Progressive educators has set up ideals which are much 
more inclusive than the mastery of certain subject matter. A typical 
list of these ideals is given by Raths (1938, p. 90) who summarized the
work of the evaluation staff of the Commission on the Relation of
School and College, of the Progressive Education Association, as
follows:

1. Improved habits and abilities with relation to reflective thinking

2. Wider and richer interests

3. An increasing consistency in important attitudes

4. Increasing facilities to adjust socially

5. Developing creativeness

6. Improved study skills and work habits

7. An increasing stock of vital information

8. A wider and better appreciation of literature, the arts, and music

9. A developing sensitivity to socially significant problems

10. A functional philosophy of life

The construction of precise instruments to measure progress to- 
ward these ten objectives is a gigantic task, but operational definitions 
have been provided for most of these ideals and marked progress is 
recorded in various chapters in Part III. 

Aptitude Tests. A number of tests have been called aptitude tests
by their authors. Inspection of these shows that they demand much the
same skills and information as achievement and psychological tests. Often
the same items are found in all three types. Although there is no
agreement among psychologists on one technical definition of the 
word aptitude, the following three are fairly common: 

Aptitude tests are any tests which happen to predict later success to some
degree. It must be demonstrated that high standing on the test will indicate
great success after some specialized training. This point of view is voiced
by Bingham (1937) and by other authors.

According to another definition aptitude tests are those measures of
achievement which will predict some future development, but which will
not themselves increase or change significantly with any further
development. This definition is used by Paterson and Darley (1936), who offer the
Minnesota Clerical Test (Illus. 8). This type of definition generally
implies that all persons tested have, through previous maturation and
experience, reached fairly high levels of proficiency in the skills tested. Hence,
those with high scores have demonstrated their ability to develop in work
of a similar nature.

The third type of aptitude indicator is not a single score, but a series of 
scores arranged in a curve of individual development. This indicator is 
seldom used, for it is difficult to secure records for a number of persons, 
which are comparable in amounts of practice and motivation. For the most 
careful predictions, developmental curves would certainly be much better 
than one test score. Because of this, there is a strong emphasis in mental 
measurements today upon securing curves of long-time development. 

When a test is to be used to predict success in an occupational or 
professional school, it should, according to one theory, include items
representing the factors important to success, and give
these items about the same weight in the test score that they have in
producing success. It has, however, been found difficult to decide
which factors are important in the development of occupational suc- 
cesses. In order to construct an aptitude test which will give good 
prediction, a program must be followed which will 

1. Select a fairly large variety of items which seem to have some 
predictive value. The selection of items is usually aided by a 
careful analysis of the skills of those experienced in an occupa- 
tion. 

2. Apply these tests to a large number of persons who are just 
beginning training for an occupation or profession. 

3. At yearly periods after training has been finished, secure ratings 
or other indications of occupational success. 

4. Compare the ratings of success with the various test scores. 

5. Combine the test scores to give the best prediction of success. 
Often a number of the original tests are eliminated before a 
final combination of tests for an aptitude scale is made. 

This procedure has been followed with good results in certain
industrial and school situations.
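
Steps 4 and 5 of the program above can be illustrated with a small sketch. Everything in it is hypothetical: the test names, the scores, the success ratings, the cutoff of .30, and the helper names `pearson_r`, `select_tests`, and `composite`. The validity-weighted sum of standard scores is a crude stand-in for the statistical weighting that a careful study would use, and is shown only to make the procedure concrete.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def select_tests(test_scores, success_ratings, min_r=0.30):
    """Step 4: correlate each test with the ratings; drop the weak tests."""
    validities = {name: pearson_r(scores, success_ratings)
                  for name, scores in test_scores.items()}
    kept = {name: r for name, r in validities.items() if abs(r) >= min_r}
    return validities, kept

def composite(test_scores, kept):
    """Step 5: validity-weighted sum of standard scores on the kept tests."""
    n = len(next(iter(test_scores.values())))
    total = [0.0] * n
    for name, r in kept.items():
        xs = test_scores[name]
        m = sum(xs) / n
        sd = sqrt(sum((x - m) ** 2 for x in xs) / n)
        for i, x in enumerate(xs):
            total[i] += r * (x - m) / sd
    return total

# Hypothetical scores of six trainees and their later success ratings.
test_scores = {
    "name_comparison": [12, 15, 9, 20, 17, 11],
    "arithmetic":      [33, 33, 32, 32, 28, 28],
    "vocabulary":      [40, 44, 38, 52, 47, 41],
}
success_ratings = [3, 4, 2, 5, 4, 3]

validities, kept = select_tests(test_scores, success_ratings)
battery_scores = composite(test_scores, kept)
```

In this made-up example the arithmetic test shows no relation to the ratings and is eliminated, so the final scale combines only the two remaining tests.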

Psychological Tests. These tests are designed to appraise traits
which have significance in a psychological analysis. They include 
tests of intelligence, tests of mental abilities, such as perception, rea- 
soning, and learning, and tests of motivation and integration, such 
as interest, attitudes, and patterns of adjustment. 

Altitude, Speed, and Breadth Tests 

The three types of tests just described may be reclassified according 
to altitude, speed, and breadth. 

Altitude Tests. Altitude tests, sometimes called power tests, allow
a liberal amount of time so that all persons tested will attempt 
nearly all the items which they could possibly pass. The scores of 
an altitude test yield an indicator of the highest level which a per- 
son achieves. The best altitude tests are constructed from carefully 
scaled items which increase in complexity as the test progresses. 

Speed Tests. Speed tests, which are also called rate tests, are of
two kinds. One, called a time-limit test, is composed of items which
are all of similar difficulty, as in the name-comparison test (Illus. 8),
and the aiming test (Illus. 4). A time limit is used which is so short
that none of the group can finish the test. The second, called a work- 
limit test, requires a standard series of operations, and the score is the 
time needed to complete the operations at a given level of excellence.
This is illustrated by an assembly or a form-board test where
materials are to be put together to form a pattern (Illus. 45 through Illus.
50).

Breadth Tests. Breadth tests are designed to measure one's range
or variety of skills or information. Great speed is not required, and
complexity is fairly constant. A good illustration of this type is the
Cooperative Test of Current Events, which samples a person's knowl- 
edge of recent developments in five fields: science, economics, foreign 
affairs, arts, and recreation. 

Many tests are combinations of altitude, rate, and breadth items; 
but mixed tests are not desirable, for they do not allow clear inter- 
pretation of results. For instance, when an altitude test is given with 
rather short time limits, two persons may receive the same score for
different reasons. One may be a fast yet superficial worker who makes
his maximum score during the time allowed. Another person may
have greater ability, but work so cautiously that he is just getting
warmed up when time is called. By similar logic, items which are
carefully graded for complexity of thought for an altitude test should
not be mixed with those which are simply rare, although rare items
may be appropriate in a breadth test. Likewise, the breadth items
should not be given with short time limits, since the main point of
a breadth test is to enumerate the various facts which a person knows. 
Frequently one finds interesting temperament or motivation indices 
in the relative speed at which persons work, so that a record of this 
factor in any kind of test is often desirable. 

The usual method of establishing time limits involves trying out 
items upon typical groups of persons. This method is employed with 
all well-standardized tests. Altitude tests have been found to give
adequate samples of ability if enough time is given to allow
approximately 90 per cent of the group to attempt every item. This is
generally the case when the tests depend to a large degree upon
information, when the items have been arranged in order of difficulty, and
when the range of difficulty is such that at least 5 per cent of the 
group can succeed on the hardest items and 5 per cent will fail on 
the easiest items. 
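
The three percentages named above lend themselves to a simple check on tryout data. The sketch below assumes a hypothetical record layout (one list per person, items ordered from easiest to hardest, with `None` marking an item not attempted); the function name `check_altitude_test` and the sample group are invented for illustration.

```python
def check_altitude_test(responses):
    """Check a tryout group against the 90 / 5 / 5 per cent guides.

    responses: one list per person, items ordered easiest to hardest;
    each entry is True (passed), False (failed), or None (not attempted).
    """
    n = len(responses)
    attempted_all = sum(all(r is not None for r in person)
                        for person in responses)
    pass_hardest = sum(person[-1] is True for person in responses)
    fail_easiest = sum(person[0] is False for person in responses)
    return {
        "time_limit_ok": attempted_all / n >= 0.90,
        "hardest_item_ok": pass_hardest / n >= 0.05,
        "easiest_item_ok": fail_easiest / n >= 0.05,
    }

# Hypothetical tryout group of 20 persons on a four-item scale.
group = (
    [[True, True, True, True]] * 2        # pass even the hardest item
    + [[True, True, False, None]] * 1     # ran out of time
    + [[True, True, False, False]] * 15
    + [[False, False, False, False]] * 2  # fail even the easiest item
)
report = check_altitude_test(group)
```

A `False` under `time_limit_ok` would suggest that the limit is turning the altitude test into a partly speeded one.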

Individual and Group Tests 

Both individual and group tests are available for many types of 
behavior. Individual testing situations are usually better controlled 
than group situations, and are more useful in the following situa- 
tions: 

1. When oral responses are needed, such as in reading aloud and
answering questions.

2. When subjects do not readily follow directions or cooperate.
(This would generally be true of persons below eight years of
age, and of defective and maladjusted persons.)

3. When forms or instruments, such as form boards, puzzles, and 
machine assembly, are to be manipulated. 

4. When it is desirable to evaluate the subject’s adjustment to the 
test situation, such as his persistence, cautiousness, emotional 
episodes of anger or fear, teasing, balking, or nervous habits. 

5. Whenever it is desirable and possible to evaluate the subject’s 
methods of work, such as the use of random movements, me- 
thodical comparisons, and reasoning. 

Group tests are more useful than individual tests: 

1. When it is desirable to avoid close personal relations with an
examiner. 

2. When it is desirable to compare the effects of group stimulation
with the effects of isolation. 

3. When the subjects can and will cooperate well. 

4. When the results of group tests have proved to be accurate 
enough to be useful in a particular situation. (Many group tests 
of written work seem to be as effective as individual tests above 
the 8-year level.) 

5. When economy of time and effort is important. 

STUDY GUIDE QUESTIONS 

1. What are the relative advantages of measures which strive to evaluate
one unitary trait as compared with measures of general mental ability?

2. How are the five major categories of traits distinguished from one
another?

3. What are the usual units of measurement for aiming, steadiness, and
dexterity tests?

4. What is the essential activity in all perceptual tests?

5. How are immediate-memory-span tests scored?

6. Why are direct measures of learning very rare?

7. What are the main essentials of a good reasoning test?

8. Why are knowledge tests the best developed at present?

9. What are the advantages and disadvantages of self-ratings?

10. What is the procedure in projective techniques?

11. What differences are there between achievement and aptitude tests?

12. Distinguish altitude, speed, and breadth tests.

13. What are the relative advantages of individual and group tests?



CHAPTER III 


CHARACTERISTICS OF A 
GOOD INSTRUMENT 




Few persons will undertake to prepare a well-standardized test, but 
many wish to know the theoretical differences between satisfactory
and unsatisfactory tests, and what can be considered sound evidence
of these differences. This chapter¹ gives certain logical and statistical
ways of defining and expressing the characteristics of a satisfactory 
measuring instrument. Three characteristics are commonly attrib- 
uted to such tests: practical aspects, reliability, and uniqueness. 

PRACTICAL ASPECTS 

For practical purposes a good testing instrument requires
minimum time, cost, and effort in its administration, scoring, and
interpretation. It is clear that, as the testing situation varies, the relative
importance of these aspects will change. Moreover, a good testing
instrument will allow a good sampling of the abilities of a person
without disturbing or embarrassing him, unless, as is sometimes the case,
the test is designed to appraise one’s behavior in a stress situation.
In order to avoid coaching but to provide for growth studies, two
or more equivalent forms should be available. The practical
requirements of a satisfactory test include the following:

a. Optimum difficulty of items

b. Scoring

   1. Objective-type tests

   2. Efficient scoring devices

c. Adequate interpretation

   1. Adequate and easily understood norms to use in comparing
      a person with others of his own age, sex, and status

   2. Validity: prediction of success

d. Ease of administration

   1. Short time-allowance

   2. Little or no supervision, except for recorded observations

   3. Simple, clear directions

   4. Minimum of materials

   5. Two or more equivalent forms

   6. Situations of interest to the examinee

¹ The last half of this chapter, beginning with page 50, is difficult to understand
without a fair background of elementary statistics, such as is found in Chapters
XII and XIII.

Some of these requirements conflict with the others. For instance, 
great economy of time given to administration usually reduces the 
significance of the results. Since the administrative aspects are dealt 
with in many other places in this book, only the first five aspects will 
be discussed here. 

Optimum Difficulty of Items 

Difficulty of an item for a given group is usually determined by 
the percentage of persons in the group who succeed on that item. 
A group of items is considered to have only one level of difficulty
when all of the items are passed by about the same proportion of a
given group of persons. In test construction the levels of difficulty
that shall be included always constitute a problem. If one desires to 
measure each individual in a group with great accuracy, then items 
from all levels of difficulty must be included in the test in sufficient 
numbers to avoid chance effects. However, if one desires merely to 
divide a group of persons into a small number of divisions, then only 
those items are needed which will indicate the division boundaries.
The simplest situation is one in which two persons are to be placed — 
one in the upper and the other in the lower half. Since the items failed 
or passed by both persons would not serve to distinguish between 
them, the optimum difficulty would be represented by those items 
which one person passed and the other failed. A more complex situa- 
tion would be one in which a large group is to be separated into an 
upper and a lower half. In this case the only items needed are those 
which most clearly differentiate among the persons of medium abil- 
ity. Such items would be neither the easiest, for they would distin- 
guish only among the least able persons, nor the hardest, for they 
would discriminate only among the most able. The items passed by 
50 per cent of the group would most adequately separate the group
into halves. Where test items have been applied to single-age groups
in particular environments, it has been possible to construct scales
with definite degrees of difficulty with considerable accuracy.
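
As a rough illustration of these ideas, the sketch below computes the percentage of a group passing each item and then picks the items nearest the 50 per cent level, which are the ones suited to dividing a group into halves. The response table and the function names `item_difficulty` and `items_nearest_half` are hypothetical.

```python
def item_difficulty(responses):
    """Percentage of the group passing each item (higher means easier).

    responses: one list per person, 1 for a pass and 0 for a failure.
    """
    n = len(responses)
    n_items = len(responses[0])
    return [100.0 * sum(person[i] for person in responses) / n
            for i in range(n_items)]

def items_nearest_half(difficulties, how_many):
    """Indices of the items whose pass rate lies closest to 50 per cent."""
    ranked = sorted(range(len(difficulties)),
                    key=lambda i: abs(difficulties[i] - 50.0))
    return sorted(ranked[:how_many])

# Hypothetical results for four persons on four items.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 1, 0],
]
difficulties = item_difficulty(responses)
middle_items = items_nearest_half(difficulties, how_many=2)
```

The first item, passed by everyone, and the last, passed by no one, contribute nothing toward separating this group into halves.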

Scoring 

A good test is so designed as to make its scoring as simple as
possible. For many tests of mental ability, coordination, and preference,
scoring can be made automatic. For the more variable projective 
techniques, however, the judgment of the scorer is very important. 
With the latter, scoring can be made more accurate and uniform by 
training the scorers in precise concepts, operational definitions, and
samples. 

Objective-type Tests. This term refers only to tests that can be 
objectively scored after they have been administered. If all persons
who score a test arrive at exactly the same score, the procedure has
been objective. Multiple-choice and completion-type items are often
called objective-type tests, because the scoring is done by using a 
predetermined key. However, there is no guarantee that the key rep- 
resents widespread agreement of competent judges. Many tests con- 
tain ambiguous and controversial items which reflect a subjective bias 
of the author. If the scorers must use their own judgment in deter- 
mining the significance of an answer, and if they disagree because of 
differences in standards, then the scoring is considered somewhat 
subjective. Objectivity in scoring is, of course, highly desirable, for 
it is only by attaining high objectivity that wide standardization and 
careful analysis of items are possible. 
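
One simple way to express this idea of objectivity in numbers is the proportion of papers on which every scorer reached exactly the same score. The sketch below is an assumption-laden illustration: the scorer names, the scores, and the function `scoring_agreement` are invented, and complete agreement is only one of several possible indices.

```python
def scoring_agreement(scores_by_scorer):
    """Fraction of papers on which all scorers reached the same score.

    scores_by_scorer: {scorer: [score on paper 1, score on paper 2, ...]}
    """
    papers = list(zip(*scores_by_scorer.values()))
    agreed = sum(len(set(paper)) == 1 for paper in papers)
    return agreed / len(papers)

# Hypothetical results: a keyed test and an essay test, two scorers each.
keyed = {"scorer_a": [42, 37, 50], "scorer_b": [42, 37, 50]}
essay = {"scorer_a": [8, 6, 9, 7], "scorer_b": [8, 5, 9, 6]}

keyed_agreement = scoring_agreement(keyed)
essay_agreement = scoring_agreement(essay)
```

A value of 1.0 corresponds to fully objective scoring in the sense used here; lower values show how far the scorers' standards differ.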

Efficient Scoring Devices. The design of an item determines both 
speed and accuracy in scoring the answer. Ordinarily, the two proc- 
esses which must be completed to obtain a test score are checking the 
answers with a key and counting the correct answers to get the total. 
Errors and omissions must also be counted if the total score is to be 
corrected for chance successes. 
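
The two processes, checking against a key and counting, can be sketched as follows. The correction shown is the conventional one, rights minus wrongs divided by one less than the number of choices; it is assumed here for illustration and is not necessarily the exact form given in Chapter IV. The key, the answers, and the function name `score_answers` are hypothetical.

```python
def score_answers(answers, key, n_choices):
    """Check answers against a key and count rights, wrongs, and omissions.

    answers: the examinee's choices, with None for an omitted item.
    n_choices: number of alternatives per item (assumed equal for all).
    Returns (rights, wrongs, omissions, corrected_score).
    """
    rights = wrongs = omits = 0
    for given, correct in zip(answers, key):
        if given is None:
            omits += 1
        elif given == correct:
            rights += 1
        else:
            wrongs += 1
    # Conventional correction for chance: R - W / (k - 1).
    corrected = rights - wrongs / (n_choices - 1)
    return rights, wrongs, omits, corrected

# Hypothetical six-item, five-choice test.
key = ["b", "d", "a", "c", "a", "b"]
answers = ["b", "d", "c", None, "a", "a"]
result = score_answers(answers, key, n_choices=5)
```

Omissions are counted separately because, under this correction, they neither add to nor subtract from the score.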

In essay- or completion-type tests the key consists of a list of
correct answers which must be compared with those the subject has
written. There are no mechanical short cuts to scoring such tests, but 
scoring can be speeded up a little in completion-type tests by having 
all answers placed in a column where they can be easily compared 
with a key. 

For true-false, multiple-choice, or matching tests three scoring de- 
vices are used: stencils, automatic devices, and machine scorers. Each 
results in a great saving of time and in increased accuracy, especially 
when large numbers of tests are to be scored. No simple rule can be 
laid down for effecting economies in scoring, for there are at least
three variables: cost of clerical labor, cost of devising and printing
automatic scoring keys, and cost of operating the machine.

Stencil scoring. In scoring true-false and multiple-choice items 
a printed stencil or scoring key is sometimes placed over or beside 
the test answers. When this has been done a clerk can quickly see 
which answers are correct. When a large number of tests must be
scored, the examinee may be required to indicate his answers on a 
separate answer sheet. The marked answer sheets are then run 
through a machine which prints the correct answers with colored 
ink near the subject's answers, which can then be rapidly checked. 

Automatic scoring devices. One of the most difficult steps in 
stencil scoring — checking answers against a key — is eliminated by 
automatic scoring. For example, a fairly large number of tests are
printed in such a way that the subject’s mark on the test will indicate 
the correctness of his answer on a key, which is placed where the 
examinee cannot see it while taking the test. 

Many tests have a key composed of small squares printed on the 
back of the answer sheet or on a sheet just beneath the answer sheet. 
The key is concealed during the test, and revealed for scoring by 
breaking the glued edges apart. A thin coating of carbon is so placed 
that when the subject makes an X with an ordinary pencil to indicate 
his choice, the X is impressed on the key sheet. To find the number 
of correct answers, one simply counts the squares on the key sheet 
which have X’s in them. 

Toops (1937) used a sharp-pointed stylus or large pin for the 
automatic scoring of the Ohio State University College Entrance Ex- 
amination. Test booklets and answer sheets are printed separately. 
The answer sheets contain rows of numbered squares. A key sheet, 
just beneath the answer sheet, has a square printed under each cor- 
rect answer. The person being tested makes a pinhole in the square 
which corresponds to his choice, and a pinhole is simultaneously 
made in the key sheet. Several key sheets can be used at once when 
duplicate records are desired. A number of authors have adopted 
this method.

Although these automatic scoring devices do not furnish a total 
score they do allow the scores to be counted directly from the marks 
made on a key by the person who takes the test. Accuracy and speed 
of scoring are thus greatly increased. 

To promote student learning by immediately revealing whether 
test questions have been answered correctly, Troyer and Angell (1950) 
published a series of answer sheets. These can be used for any tests 
that are composed to correspond to a prearranged key. Two sets of 
keys are available for (a) 300 2-choice items to a page, or (b) 150
4-choice or 5-choice items to a page, or (c) 210 2-choice and
multiple-choice items to a page. Since a number of studies have shown
that certain kinds of rote learning take place more quickly when
correct answers are provided, this type of answer sheet should receive
serious consideration. It is more economical and simpler than the
several mechanical devices that have been marketed to indicate
immediately the correctness of an answer.

Machine scoring. The International Business Machines (IBM)
Scorer (1938) uses carefully printed sheets, such as that shown in
Illus. 18, upon which the person marks all his answers with a special
pencil. As can be seen, the sheet is printed with small parallel lines
showing where the pencil marks should be placed in indicating true
items, false items, or multiple choices. To score the sheet, the
operator inserts it in the machine, moves a lever, and reads the total
score from a dial. The scoring is accomplished by electrical contacts
with the pencil marks. Each sheet can be scored for ten separate
divisions, as well as for the total. Corrections for guessing can be
obtained by setting a dial on the machine. By this method three
hundred true-false items can be scored simultaneously. The sheets can
be run through the machine as quickly as the operator can insert them
and write down the scores. The operator needs little special training
beyond that of a clerical worker. Another device tabulates the number
of times each item is correctly answered.

ILLUS. 18. MECHANICALLY SCORED ANSWER SHEET

The International Business Machines Corporation has also developed
a method of marking answers or ratings directly on a stiff 3¼- by
7⅜-inch card. The marks are punched in the card by a machine,
and can be used for immediate tabulation and item analysis.

Adequate Interpretation 

It is often much easier to administer and score a test than it is to 
interpret the results. There are three usual means of interpretation:
a person's standing in a group is indicated, his probable chance of
success in a particular situation is predicted, or some aspects of his
personality are inferred from the pattern of scores.

Place in a Group. One's place in a group is usually shown by
a centile, which is a number from 0 to 100 which indicates the
percentage of a group which a person surpasses. Other indicators are
standard scores, T-scores, and variations of these, which are explained
in Chapter XII. For growing persons mental age (MA), intelligence
quotient (IQ), and educational age (EA) are used to show level or
rate of growth (Chapter VI). One has a mental age of 10 if his score
on a mental test is the same as that of the average ten-year-old.
Intelligence quotient is defined in several different ways, but usually
it is one's mental age divided by his chronological age, with the
quotient multiplied by 100 (to avoid decimal points). Roughly it shows one's
rate of growth. For instance, an IQ of 100 means that one is grow- 
ing at the same rate as the average person; an IQ of 150 means that 
one is accelerated 50 per cent, and an IQ of 70 indicates 30 per cent 
below normal. Educational age and educational quotient are based 
on school achievement tests rather than on general mental tests. One’s 
mental age and educational age often show differences which need to 
be investigated to determine why one is doing better or worse than 
might be expected. 
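
The centile and the intelligence quotient described above can be computed directly. The following is a minimal sketch in Python and is not part of the original text. The function names and the sample scores are hypothetical, and the centile is taken here as the per cent of the group scoring strictly below the given score, which is only one of several conventions.

```python
def centile(score, group_scores):
    """Per cent of the group that a score surpasses (0 to 100)."""
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)


def ratio_iq(mental_age, chronological_age):
    """Mental age divided by chronological age, multiplied by 100."""
    return 100.0 * mental_age / chronological_age


if __name__ == "__main__":
    # Hypothetical scores for a group of ten persons.
    group = [12, 15, 15, 18, 20, 22, 25, 27, 30, 34]
    print(centile(25, group))   # six of the ten scores lie below 25
    print(ratio_iq(10, 10))     # growing at the average rate
    print(ratio_iq(15, 10))     # accelerated 50 per cent
    print(ratio_iq(7, 10))      # 30 per cent below normal
```

Both ages must be expressed in the same unit (years or months) before the quotient is taken.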

Letters (usually A, B, C, D, and E) are used a great deal in schools
to indicate the various places, or the steps of progress that the
members of a group may take in a subject, or in industriousness, or a
combination of these. Unless letters used as grades are clearly
defined, however, they are difficult to interpret. In large groups
where there is likely to be a wide distribution of ability, letter
grades are often given arbitrary values. For instance, one college has
determined that the distribution of grades for freshmen and sophomores
shall be: A, 15 per cent; B, 30 per cent; C, 40 per cent; D, 10 per
cent; E, 5 per cent. Grades for large classes are expected to conform
fairly well to these proportions.
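
A fixed distribution of this kind can be applied mechanically. The sketch below is an illustration only, written in Python. The function name `grades_by_quota` is hypothetical, the quotas are the ones quoted above, and tied scores are separated arbitrarily by their order in the ranking, which a real grading policy would have to settle.

```python
def grades_by_quota(scores, quotas=(("A", 15), ("B", 30), ("C", 40),
                                    ("D", 10), ("E", 5))):
    """Assign letters from the top score down according to per cent quotas.

    scores: dict mapping a student's name to his score.
    Returns a dict mapping each name to a letter.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    result = {}
    cumulative = 0
    start = 0
    for letter, per_cent in quotas:
        cumulative += per_cent
        # Position in the ranking at which this letter's share ends.
        end = round(n * cumulative / 100)
        for name in ranked[start:end]:
            result[name] = letter
        start = end
    return result


if __name__ == "__main__":
    # Twenty hypothetical students with scores 0 to 19.
    sample = {"s%d" % i: i for i in range(20)}
    letters = grades_by_quota(sample)
    print(letters["s19"], letters["s10"], letters["s0"])
```

In small classes the rounding makes the proportions only approximate, which agrees with the remark that large classes are the ones expected to conform.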

Test norms show where one's place is in a group to which one be- 
longs or wants to join. A norm is a set of figures which show the 
distribution of scores of a group. Age groups are essential to the 
study of growth or senescence. The best norms use limited age groups 
— 1 or 2 months for small children, and 6 or more months for 
adolescents and adults. (See Chapters V and VI.) Sex groups are
studied when the two sexes show different patterns of responses. Sepa- 
rate norms for the sexes are now given in many achievement and per- 
sonality measures. Occupational group norms are often used for 
classification or employment tests, and school- or college-grade norms 
are used for achievement tests.

This variety of group norms is useful but somewhat confusing 
when the same person is given test scores and centiles for different 
groups. For example, one veteran’s record showed that he was in the 
70th centile of a large adult group on the Army General Classifica- 
tion Test, in the 40th centile of a group of freshmen engineers on an 
engineering aptitude test, and in the 90th centile of a group of high 
school boys on the Kuder Preference for Mechanical Work Test. 
Illustration 21 (page 58) shows the usual relation between groups 
of adults in frequency distributions of language- and number-ability 
tests. Scores are marked at the base of the chart, and the number of 
persons making each score is shown by the height of the vertical 
column above the score. It appears that the middle of the adult 
group is near the lower end of the distribution of scores of high 
school graduates, and the middle of the high school graduates group 
is near the lower end of the distribution of scores of the college 
graduates. Other sections of each group can be similarly related, as 
inspection of the chart shows. Charts of this kind are needed but are 
not available for many types of tests and preferences. 

VALIDITY

Prediction of Success. In order to give an indication of one's
probable chances of success in a given course of study or occupation,
the author of the test must furnish evidence that, in similar
situations, the test scores were related to success to a certain degree.
The most common evidence is a correlation coefficient, which is some
number between -1.00 and +1.00. Plus one indicates a perfect
prediction, 0 a chance relationship, and -1 a perfect negative relation.
Numbers less than 1.00 indicate less than perfect correlation.
(Chapter XIII describes how the numbers are calculated.) Success in
school subjects can usually be predicted from the most appropriate
tests by correlations as high as .50 to .75. Success in skilled trades
and clerical work is not quite as well predicted, but it can be
predicted much better than chance. Correlations such as these are
generally called validity coefficients. A correlation coefficient is
not a percentage, and it does not show any causal relation between
patterns of behavior. It simply indicates coincidence of position in a
group when persons in the group are placed by two different appraisals.
The size of a coefficient has the following rough significance when the
group is large (300 or more). For smaller groups the correlations are
in general less significant (Illus. 142).

.95 to 1.00   The coincidence is nearly perfect. One type of success
              can be predicted from the other very well. A reliable
              test will predict a retest to this extent.

.75 to .95    Good individual predictions can be made for most of
              the group, but there will be some divergence.

.50 to .75    These coefficients are not high enough to make good
              individual predictions, because many who are below
              average on one test will be above average on the other.
              The extremes of the group are predicted fairly well.
              Coefficients are useful for indicating group trends.

.25 to .50    These coefficients are too low for individual use, but
              they roughly indicate group trends, and can be used to
              supplement other kinds of predictions.

0 to .25      These coefficients are often not significantly different
              from zero.
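
As an illustration of how such a coefficient is obtained and read, the following Python sketch computes the ordinary product-moment correlation between two sets of scores and looks up the rough meaning listed above. It is a sketch under stated assumptions, not the method of Chapter XIII. The function names and the five pairs of scores are hypothetical, a boundary value such as .75 is placed in the higher band (the list itself does not say which band it belongs to), and the verbal labels apply only to large groups.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)   # fails if either list has no spread


def rough_significance(r):
    """Rough verbal meaning of the size of a coefficient (large groups)."""
    size = abs(r)
    if size >= 0.95:
        return "nearly perfect coincidence"
    if size >= 0.75:
        return "good individual predictions for most of the group"
    if size >= 0.50:
        return "group trends; extremes predicted fairly well"
    if size >= 0.25:
        return "rough group trends only"
    return "often not significantly different from zero"


if __name__ == "__main__":
    test_scores = [1, 2, 3, 4, 5]      # hypothetical test scores
    success = [2, 1, 4, 3, 5]          # hypothetical ratings of success
    r = pearson_r(test_scores, success)
    print(round(r, 2), rough_significance(r))
```

Five cases are used only to keep the arithmetic visible; with so small a group the coefficient would carry little significance.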

It is often inferred that a test has high validity because it looks as
though it should. Validity of this type is called face validity. On this
topic Guilford (1946, p. 437) wrote:

Even sophisticated judgment often goes astray on decisions as to what
a test measures. A test designed to measure common-sense judgment when
factor analyzed turns out to be a test of mechanical experience. A test
designed as a reasoning test is found to be one of numerical facility,
when analyzed. A test of pilot interest proves to have some variance,
indeed, in that factor, but it is stronger in variance for the verbal
factor. A test designed to test the ability to maintain orientation in
space turns out to be primarily a measure of perceptual speed. This
list could be extended. The moral of it is that in test construction
and in job analysis, things are not always what they seem. This is
primarily because our categories of aptitudes and traits have been
faulty. Empirically determined factors, on the other hand, when
sufficiently well defined, seem to be stable and dependable, and they
are amenable to direct observation once they have been brought to
light. This discussion does not necessarily argue against the use of
"face validity" in tests. Face validity makes tests more palatable to
the public. But face validity may have nothing whatever to do with
actual validity, and it should be remembered that the problem of actual
validity is never solved just because a test has face validity.

RELIABILITY 

The characteristic which makes a test yield the same results when 
applied a number of times is called reliability. Reliability is highly 
desirable, for prediction is impossible or very difficult if, through 
practice or chance, a person gets a radically different score on a 
repetition of the same test. For two trials of the same test on a large
group, reliability is best indicated by a correlation coefficient. When
the test cannot be repeated, there are a number of other methods that
can be used fairly successfully as substitutes. Four coefficients are
commonly used: retest reliability, split-half reliability,
equivalent-form reliability, and Kuder-Richardson estimates (Chapter X). Since these
procedures sometimes give different results, they should not be con- 
fused. In all cases a high coefficient is considered an indication of 
small random variations, and a low coefficient indicates either ran- 
dom or systematic variations or both. 

Retest reliability is secured by giving a group the same test twice
within a few days. When it is difficult to secure two tests of the same
group, a split-half reliability may be secured by correlating scores on
one half of a test with scores on the other half. Often the scores for
odd-numbered items are correlated with the scores for even-numbered
items. Correlations from the halves of a test, given on the same
day, are frequently from .05 to .10 points higher than correlations
from the same halves given a few days apart. The differences are due
in part to changes in attitudes or conditions of the persons tested.
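
The odd-even procedure can be sketched as follows. This Python example is an illustration only. The function names and the small table of answers (five persons, six items, 1 for a pass and 0 for a failure) are hypothetical, and so small a group would give an unstable coefficient in practice.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def odd_even_scores(item_scores):
    """Split one person's item scores (item 1 first) into two half scores."""
    odd = sum(item_scores[0::2])    # items 1, 3, 5, ...
    even = sum(item_scores[1::2])   # items 2, 4, 6, ...
    return odd, even


def split_half_r(answers):
    """Correlate odd-item scores with even-item scores over all persons."""
    halves = [odd_even_scores(person) for person in answers]
    odd = [h[0] for h in halves]
    even = [h[1] for h in halves]
    return pearson_r(odd, even)


if __name__ == "__main__":
    answers = [
        [1, 1, 1, 1, 1, 1],
        [1, 1, 1, 0, 1, 0],
        [1, 0, 1, 1, 0, 0],
        [1, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
    ]
    print(round(split_half_r(answers), 2))
```

The coefficient obtained this way describes a test only half as long as the one actually given.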

The correlation between halves of a test indicates only the
variations between the halves. Since the test is to be used as a whole,
it is desirable to know the probable reliability of the test as a
whole. This can be calculated from the Brown-Spearman (Spearman, 1910)
formula for prophesying the probable effect of lengthening a test. The
self-correlation $r_{nn}$ of two tests which are made $n$ times as long
as they were originally, where $r_{11}$ is the known correlation of the
two originals, would be:

$$ r_{nn} = \frac{n\,r_{11}}{1 + (n - 1)\,r_{11}} $$

For example, if the two halves of a test correlate .60 with each other,
the whole test would probably correlate .75 with another test of
similar length and design, that is:

$$ r_{22} = \frac{2 \times .60}{1 + (1 \times .60)} = \frac{1.20}{1.60} = .75 $$

This formula may be used to find how long a test must be made to
achieve a particular self-correlation. Of course, the formula would
not apply if, by lengthening a test, one introduced new factors, such
as fatigue, loss of interest, new sorts of items, or practice on the
part of individuals.
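
Both uses of the Brown-Spearman formula, estimating the reliability of a lengthened test and finding the length needed for a desired reliability, can be sketched in a few lines of Python. The function names are hypothetical. The second function is simply the same formula solved algebraically for n, and it rests on the same assumption that the added material introduces no new factors.

```python
def spearman_brown(r11, n):
    """Probable self-correlation of a test made n times as long."""
    return n * r11 / (1 + (n - 1) * r11)


def length_factor_needed(r11, target):
    """How many times as long a test must be made to reach `target`.

    Obtained by solving target = n*r11 / (1 + (n - 1)*r11) for n.
    Requires 0 < r11 < 1 and 0 < target < 1.
    """
    return target * (1 - r11) / (r11 * (1 - target))


if __name__ == "__main__":
    # Two halves correlating .60: reliability of the whole test.
    print(round(spearman_brown(0.60, 2), 2))
    # Length factor needed to raise a self-correlation of .60 to .90.
    print(round(length_factor_needed(0.60, 0.90), 2))
```

For a test of forty items with a self-correlation of .60, a length factor of 6 would mean about 240 items of the same kind, which shows how quickly the required length grows as the desired coefficient approaches 1.00.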

The equivalent-form reliability coefficient is secured by correlating
two forms of a test that are intended to sample the same ability.
Equivalent-form reliabilities are usually from .05 to .10 points smaller
than retest reliabilities, for equivalent forms have different specific
items, while a retest uses the items of the original test. Since retest
reliabilities are also sometimes inflated when previous performances
are specifically remembered, equivalent-form reliabilities are
preferred for test appraisal.

Reliabilities are also indicated by the Standard Error of Estimate
of a score based on equivalent-form correlations. Some authors prefer
this indicator because it gives immediate knowledge of the probable
range of a score (Chapter XIII).

UNIQUENESS 

The most difficult requirement for satisfactory interpretation is
that characteristic of a test which makes a given score always
represent the same pattern of behavior. Many tests include items which
sample different kinds of behavior. For instance, if a test includes
number, language, and spatial items, two persons may earn the
same score, but for different reasons. One may do well in language
but poorly in mathematics, and the other may have just the opposite
skill.

Factorial Analysis 

As tests have become more widely used and scrutinized, the demands
for unequivocal scores have become more numerous. In addition to
subjective analyses to determine the elements of a test, the
statistical technique called factorial analysis has gained considerable
usage. The several types of factorial analysis have given somewhat
different results, but there is also much agreement among them. Each
seeks to analyze the factors, usually the smallest number of factors,
which explain the variations of scores of a group of persons on a
battery of tests. From factorial analyses the amount of variation due
to each factor may be estimated for each test. This is called factor
loading of a test. The factorial composition of a test is indicated by
its pattern of factor loadings. Certain tests show large loadings of
several factors. These are called impure or heterogeneous tests, for
they do not allow a clear interpretation of results. Other tests show
large loadings of only one factor and small or zero loadings of all
other factors. These are called pure or homogeneous tests, and are
considered to have high factorial validity. They are more desirable
because they yield clear interpretations and lend themselves to careful
analyses better than the less pure test. Chapter XIV illustrates the
use of factorial analysis in the selection of test items.

Sampling 

A considerable number of technical studies of homogeneity of tests
are in progress (Chapter XIV), but a simple check of the validity of
a test may be applied when one wishes to find out how well two
samples represent a large block of test items. This check is important
since mental tests must usually of necessity be short. Modern tests
are often made by selecting about forty items from among several
hundred. To check the validity of two short tests which have
been made up of items chosen from a large number, one must apply
both to a fairly large group of persons who are thought to differ
widely in their knowledge of the information tested. Next, the scores
from the two forms must be correlated. When correlations are low,
one must conclude that some factors influence the scores on one test,
but do not influence the scores on the other in the same way. This
correlation procedure is necessary to eliminate poor samples, but
it still does not guarantee that good samples of the master list have
been secured, for factors within the master list may correlate highly
with other factors not included in it. Such extraneous items or factors
may creep in during the construction or administration of various
test forms. It is further necessary to prove that nothing but the factor
one desires to measure is represented by each test score. It is
impossible to prove this with a simple correlation technique, for a
correlation coefficient never shows mental processes but simply
coincidence in relative position. For more adequate proof, one must use
direct observation to appraise the behavior represented by the test
scores.

As stated on page 50, reliability is the characteristic which makes
a test yield the same results when applied a number of times. This
identity of results is important because a test cannot be quantitatively
valid unless it is reliable. Its validity coefficient cannot be greater
than its reliability coefficient. Several methods of computing
reliability are described below.

Item Analysis 

One characteristic of a test which is evidence of reliability is inter- 
nal consistency. A test is considered to have high internal consistency 
when each item or subtest of a battery is found to arrange a tested 
group of persons in the same order of excellence as that indicated 
by the total score. This characteristic is always a valuable one for 
tests that are constructed to appraise one independent factor. Of the 
various methods of evaluating internal consistency, two that are 
widely used will be described — the split-group method and the cor- 
relation method. 

The split-group method first divides a large group of persons into
subgroups, usually thirds or fourths, on the basis of their total scores
on a test made up of many items. Then the percentages of persons
in the highest and the lowest thirds who pass each item are found
(Illus. 19). If on an item the lowest third does as well as or better
than the highest third, one must infer that the item and the total
test do not measure the same processes. If all of the highest in the
group pass the item, and all of the lowest fail it, then the item
divides the group into classes much as the whole test divides the group.

ILLUS. 19. EVALUATION OF TRUE-FALSE ITEMS
Per Cents Passing, and Correlations

                                       Item
                     A      B      C      D      E      F      G
Highest third       95%    57%    96%    65%    33%    95%    85%
Middle third        65     54     94     67     54     92     54
Lowest third        50     59     97     63     68     64     56
Correlation with
  total score       .74    .14    .07   -.06   -.32    .30    .32

This split-group method is illustrated by Anderson's (1935) report
on an analysis of a 222-item examination in educational psychology.
On the basis of discrimination between the lowest, middle, and highest
thirds, 86 items were classified as good, 83 as poor, and 53 as
intermediate. The scores of each student on good and poor items were
calculated. The means of the good and the poor items were nearly the
same (50.3 and 55.3), but the scores on the total 222 items correlated
.95 with the scores on the 86 good items, and .15 with the 83 poor
items. The correlation of final grade in the course with the good items
was .90, with the poor .22, and with the total score .89. The
split-half reliability for the total examination was £64, for the good
items .832, and for the poor items .06.

These figures show that the split-group evaluation allowed a selec- 
tion of approximately one third of the total items, which gave as 
good discrimination as the whole examination, and which predicted 
final grades in the course a trifle better than did the whole
examination. It was further shown that the total items do not correlate
quite so well with high school rank and scores on the Iowa English
Examination as do the good items alone. The study might well be
followed up by further work to show the smallest number of good items
which would be needed to allow satisfactory discrimination of this
group of students and of other similar groups.

Another illustration of the use of this method is seen in the work
of Terman, et al. (1917) (Illus. 38). Here the percentages of high,
low, and medium IQ groups who passed each item are indicated.
Since the chronological age (CA) was nearly constant for each group,
the IQ in this case represents the total score of the test. Nearly all
the items show large differences between the lowest and the middle
group. The differences between the middle and the highest groups
are usually smaller, owing to the fact that the items were chosen so
that the middle group would have more than 50 per cent correct.
Items which did not show as large discriminations were omitted from
the Stanford Revisions of the Binet Test.

The correlation method, which is preferred since it seems to
involve less work, correlates the scores made on each item with the
scores from the whole test, or with some other desired criteria. If
such a correlation is high and positive, it is likely that the item
differentiates between persons in the same manner that the whole test
does. If the correlation is low or negative, success on the item is
likely to be determined either by chance or by other factors than those
which affect the test as a whole. Illustration 19 shows the results of
applying both split-group and correlation methods.
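
Both methods can be applied to a single item with very little computation. The following Python sketch is illustrative only. The function names and the nine persons' scores are hypothetical, the group is cut into three parts of equal size by total score (at least three persons are needed), and persons with tied totals fall into thirds according to their order in the list.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def thirds_passing(item, totals):
    """Per cent passing an item in the highest, middle, and lowest thirds.

    item: 1 (pass) or 0 (fail) for each person.
    totals: total test score for each person, in the same order.
    """
    order = sorted(range(len(totals)), key=lambda i: totals[i], reverse=True)
    k = len(order) // 3
    groups = {
        "highest": order[:k],
        "middle": order[k:len(order) - k],
        "lowest": order[len(order) - k:],
    }
    return {name: 100.0 * sum(item[i] for i in members) / len(members)
            for name, members in groups.items()}


def item_total_r(item, totals):
    """Correlation of pass-fail on one item with the total score."""
    return pearson_r(item, totals)


if __name__ == "__main__":
    totals = [9, 8, 7, 6, 5, 4, 3, 2, 1]
    item = [1, 1, 1, 1, 0, 1, 0, 0, 1]
    print(thirds_passing(item, totals))
    print(round(item_total_r(item, totals), 2))
```

An item that behaves like Item A of Illustration 19 would show falling per cents from the highest to the lowest third together with a high positive correlation.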

Item A is passed by 95 per cent of the highest third, 65 per cent
of the middle third, and 50 per cent of the lowest third, and its
correlation with the total is .74. This item is considered good in the
sense that it results in nearly the same classification of persons as
the whole test. A test composed only of such items is said to have high
internal consistency.

Item B, a true-false item, is passed with approximately chance
success by all thirds, and its correlation is so low that it indicates
a nearly random relationship. These figures show the item to be too
difficult for this group of persons.

Item C is usually passed by almost all of each third and its
correlation is nearly zero. Clearly it is too easy to affect the
relative standing of persons in this group.

Items D and E both indicate ambiguity. The figures for Item D
show that all thirds of the group had the same success and that their
scores are considerably above chance. There is a small negative
correlation with the total. Item E is passed with less than chance
success by the highest third. The middle and lowest thirds succeed much
better on this item. The correlation with the total is -.32. The
most probable explanation of the results on Items D and E is either
(1) that the persons in the highest and lowest thirds interpreted
the items differently, or (2) that the item measures a skill which is
negatively related to the skills measured by the total scores. Both
Items D and E are of doubtful value in this test.

Items F and G exhibit partial success in classifying the persons
tested. Item F distinguishes between the lowest and the middle third,
but not between the middle and the highest third. The opposite is
true of Item G. Both items have a positive but low correlation with
the criterion. Although they may be used in the test, they are not so
effective as is Item A. Such statistical analysis of the effectiveness of
items is so easily made that it should frequently be employed. It
shows conclusively where chance successes and ambiguities affect
the scores.

Literally thousands of items were analyzed by AAF technicians
using for the most part a phi coefficient (φ) (Chapter XIV). This
procedure computes an index between passing or failing an item and
belonging to either the highest or the lowest part of a large group
according to total test scores or by some other criterion. The result
is similar to a correlation coefficient. This procedure saves time
since the papers can be grouped into high and low criterion groups
at the start, and thereafter the frequency of correct answers can be
rapidly counted visually or by machine. The phi coefficient is
somewhat higher for items with 50 per cent difficulty than for either
more difficult or easier items; therefore when the latter are desired
they must be found by an additional inspection for difficulties.
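
The phi coefficient is computed from the four counts of a fourfold table. The sketch below, in Python, uses the standard formula for a 2 x 2 table. It is offered as an illustration and not as the particular routine used by the AAF technicians, and the function name and the counts are hypothetical.

```python
from math import sqrt


def phi_coefficient(high_pass, high_fail, low_pass, low_fail):
    """Phi for a fourfold table of criterion group by pass or fail."""
    a, b, c, d = high_pass, high_fail, low_pass, low_fail
    product = (a + b) * (c + d) * (a + c) * (b + d)
    if product == 0:
        # An empty row or column leaves the index undefined.
        raise ValueError("every row and column must contain some cases")
    return (a * d - b * c) / sqrt(product)


if __name__ == "__main__":
    # Item of about 50 per cent difficulty: 40 of 50 high and 10 of 50 low pass.
    print(round(phi_coefficient(40, 10, 10, 40), 2))
    # Easy item: every high-group person and 40 of 50 low-group persons pass.
    print(round(phi_coefficient(50, 0, 40, 10), 2))
```

In the second table the high group cannot do any better than it does, yet phi is much smaller than in the first. This is the dependence on item difficulty mentioned above.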

A test may have a high degree of internal consistency but a low
correlation with any criteria of success. A test may also be constructed
to have high internal consistency by selecting each item to predict
some criterion of success. Such a test will be the most valid predictor
of some practical success. For example, Uhrbrock and Richardson
(1933) divided a group of supervisors into thirds on the basis of
estimates of efficiency. The proportion of persons in each third who
passed each of 820 items was found. The eighty-five items which best
differentiated the thirds were found to correlate .71 with supervisory
ability, whereas a test of four hundred unselected items predicted
supervisory success with a correlation of only .49. This study shows
that by this method the best items for predicting success can be
selected rather quickly. Tests constructed by this method must
always be checked, however, by a tryout on an entirely new group.
Such a tryout is called cross validation.
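
The two stages, selecting items on one group and checking them on another, can be sketched as follows. This Python example is a hypothetical illustration and does not reproduce the procedure of Uhrbrock and Richardson. The function names and the data are invented, items are ranked by the difference between the proportions passing in the top and bottom thirds of the criterion, and each group must contain at least three persons.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def discrimination(item, criterion):
    """Proportion passing in the top third minus that in the bottom third."""
    order = sorted(range(len(criterion)), key=lambda i: criterion[i],
                   reverse=True)
    k = len(order) // 3
    top = sum(item[i] for i in order[:k]) / k
    bottom = sum(item[i] for i in order[-k:]) / k
    return top - bottom


def select_items(responses, criterion, keep):
    """Index numbers of the `keep` items that best separate the thirds.

    responses: one list per item, holding 1 or 0 for each person.
    """
    ranked = sorted(range(len(responses)),
                    key=lambda j: discrimination(responses[j], criterion),
                    reverse=True)
    return sorted(ranked[:keep])


def cross_validate(chosen, new_responses, new_criterion):
    """Correlate scores on the chosen items with the criterion in a new group."""
    scores = [sum(new_responses[j][p] for j in chosen)
              for p in range(len(new_criterion))]
    return pearson_r(scores, new_criterion)


if __name__ == "__main__":
    tryout = [[1, 1, 1, 0, 0, 0], [1, 0, 1, 0, 1, 0],
              [0, 0, 1, 1, 1, 1], [1, 1, 0, 1, 0, 0]]
    chosen = select_items(tryout, [6, 5, 4, 3, 2, 1], keep=2)
    new_group = [[1, 1, 0, 1, 0, 0], [0, 1, 0, 1, 0, 1],
                 [1, 1, 1, 1, 1, 1], [1, 0, 1, 0, 1, 0]]
    print(chosen, round(cross_validate(chosen, new_group, [6, 5, 4, 3, 2, 1]), 2))
```

The coefficient found on the new group is the one to report, since the selection of items capitalizes on chance differences within the first group.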

The inclusion in a test of items which have low correlations with 
any criterion produces a measuring instrument which has low in- 
ternal consistency, and prevents any careful analysis of mental proc- 
esses. One cannot safely conclude from such a test that there is any 
clear pattern of behavior or skill present; the opposite is more likely. 
Fairly high internal consistency, however, still does not guarantee
the appraisal of one independent factor. Several factors may be
represented by the various items or by any one item. Only by a
detailed psychological analysis (Chapter XIV) can the factor pattern
become known.

This discussion of test validity and reliability is intended to be
merely introductory. It points to the need for better methods of
appraising the worth of a test, methods that will clearly distinguish
both qualitative and quantitative aspects. No simple method is
likely to appear, but interested students will find the more extensive
practical and theoretical discussions well summarized by Brown and
Thomson (1925), Thurstone (1946), Guilford (1942), Garrett (1947),
and R. L. Thorndike (1947).

SCORE SHEET FOR APPRAISING A TEST 

Very few tests are rated above 90 by a good examiner. Illustration
20 is a score sheet for appraising a published test. Procure a test
and its manual of directions; then evaluate it on this score sheet.

ILLUS. 20. SCORE SHEET FOR APPRAISING A TEST

Title ____________________  Author ____________________
Publisher ________________  Range _____________________
No. of Forms _____________  Time Required _____________
Cost _____________________  References ________________

Directions: Rate each item, using the following numbers:

4 for excellent, unusually well done
3 for satisfactory
2 for adequate for most situations
1 for serious omissions or difficulties
0 for omitted or misleading

MANUAL

 1. How adequate is the manual (completeness, arrangement, ease of
    reading)? ____

PURPOSE AND PREPARATION OF ITEMS

 2. How clearly does the author explain what traits are to be
    evaluated? What precautions are taken to eliminate extraneous
    factors? Are speed, power, or breadth tests clearly
    distinguished? ____
 3. How well did the author evaluate each item? ____
 4. To what extent is a good sampling of the trait guaranteed by
    selection of topics or control of situations? Are enough items
    included for a reliable measure? ____
 5. To what extent are items ambiguous or stated with too difficult
    vocabulary levels? Are pictures well drawn? ____
 6. How well are items scaled to give equal units and the needed range
    of difficulty? ____

ADMINISTRATION AND SCORING

 7. How well is the purpose of the test explained, and students
    motivated to cooperate? ____
 8. How adequate are directions and practice periods? ____
 9. How hard is it to find the right place to put the answer? Separate
    answer sheets often make this a source of error. ____
10. How hard is it to find and follow directions for
    administration? ____
11. Are time limits well set? ____
12. Is the test scored automatically (4), by machine (3), by hand (2),
    subjectively (1)? ____
13. Are adequate spaces provided for recording results? ____
14. Are corrections for chance success adequate? ____
15. Are scoring weights used appropriately? ____

NORMS FOR INDIVIDUAL PREDICTIONS

16. Are the shapes of distributions clearly given? How do they compare
    with a normal distribution? What deviations should be
    expected? ____
17. Are the norms given for large representative samples? ____
18. Are norms given for sex, age, grade, occupation, or other needed
    groups? ____
19. Are norms well presented by centiles, age equivalents, standard
    scores, I.Q.'s, etc.? ____

RELIABILITY

20. How adequate are the indications of reliability? (Method used,
    groups sampled.) Are standard errors of estimate reported? Are
    some parts of the test more reliable than others? ____
21. Is reliability enough for individual predictions? ____

VALIDITY. How well is validity established by:

22. Cross validation? ____
23. Difference between criteria groups? ____
24. Judgments of experts? ____
25. Correlations with other measures? ____
26. Factorial analyses? ____

RATER ____________________  Total Score ____

ILLUS. 21. COMPARISON OF GROUP DISTRIBUTIONS



STUDY GUIDE QUESTIONS 

1. What determines ease of administration?

2. How is the optimum difficulty of items related to the group to be
tested?

3. What are objective tests? What are their limitations?

4. Describe the use of stencils, machines, and automatic scoring devices.

5. How is one's place in a group indicated?

6. Show why the group must be fairly adequately described in order to
yield a true interpretation.

7. What are validity coefficients?

8. What is face validity? How much can it be relied upon?

9. What is test reliability?

10. How is test uniqueness or purity defined?

11. What do the factor loadings for each test show?

12. How can two equivalent samples of a large group of items be
prepared?

13. How may the Brown-Spearman formula be used to indicate the number
of items a test should have to reach a desired reliability coefficient?

14. What is meant by internal consistency? How may it be used to
improve a test?

15. Define mean, median, centile, norm, coefficients of correlation.




CHAPTER IV 


CONSTRUCTION OF 
TEST ITEMS 


In Chapter III the standards for judging the worth of a test were
described. This chapter outlines the relative advantages of using the
various types of items, gives certain rules for their construction, and
discusses chance success, guessing, and corrections for guessing.

WHEN SHOULD NEW ITEMS BE CONSTRUCTED?

Widely standardized tests have the advantage of being, for the
most part, well constructed, reliable, and accompanied by useful
norms, but they often contain elements that are not wanted and
omit material that is wanted. For classroom examinations of a
particular course, standard achievement tests can seldom be used,
because courses vary considerably from school to school, and even
within the same school. Likewise for industrial or military training
or selection, the job requirements often need to be fairly specific.
To meet these special needs, usually local ones, good examinations
should be prepared. Essays or recitations have been used for a long
time, and are still the only satisfactory means of appraising
composition and speech. However, items having from two to five choices
not only result in great economy in scoring, but allow the author of
the test, if he is diligent, to get precise indications of the degrees
of mastery of the knowledge or skill which he wishes to measure.

At first acquaintance, tests using short items seem to be the answer
to the overworked instructor's need for valid and easily constructed
tests of information or skill. This belief is sadly dispelled by a few
attempts to construct items. Even items which have been worked
over by several persons often prove of little value. Thus, Wood (1927)
found that almost 30 per cent of the items used in an examination of
law students were ineffective in discriminating the poorer from the
better students. Anderson (1935) found that one third of the items
used in an educational psychology test were more adequate for
appraisal of the work of the course than was the whole test. Some of
the items were merely useless for distinguishing the bright from the
dull students, and some introduced errors of measurement through
ambiguities.


TYPES OF TEST ITEMS 

Items may be classified into types according to the number and 
arrangement of their elements. The type determines to a large 
extent the economy of administration and scoring, and the presence 
of factors causing chance errors. Type of question also determines
the processes needed for success, but to a smaller extent.

It should be stated before proceeding with this discussion that,
regardless of the type of question, the best test for a particular
appraisal is that which has the largest proportion of valid items in it.
Validity, discussed in Chapter III, cannot be known very definitely
before the test has been tried out on a fairly large sample of persons.
Suppose, however, that a number of different types of items which
cover the same material have been tried out on a group and that the
various types all show significant correlations with the criteria. It is
then reasonable to ask which type of test question is the best to use
in a particular situation. Five types of commonly found items are
presented in Illus. 22.

MERITS OF VARIOUS TYPES OF ITEMS 

If the purposes and time limits of a testing program are definitely
fixed, then the question, which is the best type of item? can be
answered. Some of the answers are summarized by ratings in Illus. 23.
These ratings represent this writer's opinion, based on numerous
observations as well as on a number of studies by other investigators.
Since ratings depend to a marked degree upon personal experience,
they might vary considerably if made by another judge. Thus, the
ease of composition of a particular type of item is dependent, at
least in part, on one's experience. To a lesser degree judgments of
ease of administration and scoring are reflections of personal
experience in these activities. For many test situations, however, the
advantages and disadvantages of various forms are fairly clear.

ILLUS. 22. TYPES OF TEST ITEMS


1 True-False Items. These require a judgment that a given statement is
true or false, as:

The sum of 6 and 7 is 14. T F

2 Completion-Type Items. These direct the subject to complete a picture
or a statement by supplying an appropriate element, as:

The sum of 6 and 7 is ______

3 Multiple-Choice Items. These call for a choice of a correct answer to
a question from several incorrect answers, as:

The sum of 6 and 7 is 12, 13, 14, 15, 17

4 Matching Items. These demand that each of a list of elements, usually
about 10, be matched for significant relationships with elements chosen
from another list, as:

Write the number of the book in front of its author's name.

     Author               Book

___  Poe               1. Faerie Queene
___  DeFoe             2. Plain Tales from the Hills
___  Dickens           3. Pickwick Papers
___  Spenser           4. The Gold Bug
___  Tennyson          5. The Taming of the Shrew
___  Byron             6. Don Juan
___  Kavanagh          7. The Four Hundred
___  Kipling           8. The Idylls of the King
___  Conrad            9. Lord Jim
___  Shakespeare      10. The Odyssey
                      11. Robinson Crusoe
                      12. The Pearl Fountain

5 Essay Items. Describe briefly the mental process involved in
multiplying a four-place number by a two-place number.

Illustration 23 shows the true-false test item ranked first in three
aspects: administration, ease of scoring,¹ and short time per item. It
is given the second rank for economy of printed space, ease of
composition, and clarity, and the third for freedom from chance success,
dependence on recall rather than recognition, and analysis of results.
Complexity of thinking on true-false items is usually small, though
it can be made great; hence no definite rating is given. This aspect is
assigned a question mark. True-false test items are probably the best
to use:

1. If one is faced with a situation in which time is short for
composing, administering, and scoring a test.

2. If the test will have to be scored by clerical helpers who are
ignorant of the subject.

3. If complexity of thinking and recall of information are not
considered as important as a wide range of information.

4. If occasional chance successes and failures are not too serious.

¹ Another aspect, accuracy of hand scoring, was investigated by Dunlap
(1938). From a rescoring of 398 Terman Group Tests of Mental Ability it
was found that true-false and 2-choice items had nearly twice as many
errors as completion or multiple-choice items. Nearly 10 per cent of
items were mis-scored on the true-false tests, a serious error.

ILLUS. 23. MERITS OF TYPES OF ITEMS

(1 is the highest rank; 2, next highest; and 3, lowest)

                                                       Type of Item
                                          True-  Comple-  Multiple-  Match-  Rear-
Aspect                                    False  tion     Choice     ing     range  Essay

 1. Easy to compose                         2      2          3        2       2      1
 2. Easy to understand directions           1      1          1        2       2      1
 3. Short time per item                     1      3          1        1       1      ?
 4. Little printed space per item           2      2          2        1       3      1
 5. Easy to score, no partial credits       1      2          1        1       1      3
 6. Free from chance success                3      1          1        1       1      3
 7. Complexity of thinking                  ?      ?          ?        1       1      1
 8. Question clear, not a puzzle            2      3          1        1       1      2
 9. Dependence on recall, not recognition   3      2          3        2       2      1
10. Analysis of results *                   3      2          1        2       2      1

* Evidence showing why the examinee failed; types of errors or omissions.

In Illus. 23 the completion type of test item is rated high in ease of
administration and freedom from chance successes, and second in
ease of composition, amount of printed space, ease of scoring,
dependence on recall rather than on recognition, and analysis of
results. It is third in the length of time needed and in the clarity of
the questions, but complexity of thinking is not rated for this type.
Some, who prefer the completion to the true-false type, believe that
completion items are less likely to suggest wrong answers and to
encourage superficial thinking.

A type of test item which combines some of the good points of
both the true-false and the completion types is the multiple-choice
type. It is administered and scored about as fast as the true-false
type, and is almost as free from chance errors as the completion type.
A multiple-choice item is particularly well suited to mathematics,
spelling, and vocabulary tests. It is much more widely used in
standardized test forms than any other type of item.

Matching-test items are particularly well suited to the classification
of facts, such as are found in Chemistry (Illus. 12), and Mechanical
Appliances (Illus. 14). They allow some analysis of results and
eliminate chance successes fairly well. In Illus. 23 they are given the
highest rank with regard to six aspects, and the second rank with regard
to the other four.

A rearrangement-test item has approximately the same good points
as a matching-test item, and may allow slightly more precision in
scoring. Directions for matching and for rearrangement of items are not
as easy to understand as the directions for some of the other types, but
they can be readily grasped by the average sixth grade pupil, or an
adult who is above the lowest 8 per cent of the population in academic
status.

An essay-type item calls for an oral or a written explanation or
exposition of facts. It is one of the easiest types of item to compose,
yet the hardest to score, since innumerable variations appear in the
answers. It usually demands more recall and organization of ideas
than any of the other types. For this reason the essay-type item is
widely used in appraising problem-solving abilities of advanced
students. It is also the most usual way of appraising verbal
compositions. It is the only type of test in which the examinee has an
opportunity to give expression to his individual style and poetic fancy.
In the field of literature it can never be replaced by short-answer
items. Since long essays are more difficult to score than short ones, a
common practice in educational tests is to limit the essay to 150 words,
or to a definite space. This limitation has the advantage of making the
examinee think out his answer clearly before he writes it.

Another approach to the relative value of various forms of items is
the effect which they may have upon students' methods of study.
Meyer's (1936) report is typical of several others. He told equated
groups of students to prepare for one of the following types of
examinations: true-false, multiple-choice, completion, and essay. At
the end of regulated study periods all groups were given all four types
of examinations. He found that the students who had prepared for
the essay tests showed better average scores on all forms of tests than
the other students. Those who had prepared for the completion-type
tests came next, while those who had prepared for the true-false and
multiple-choice tests had almost the same average scores.

A number of persons have tried out tests which combine two of the
forms just mentioned. A true-false form was combined with a completion
form by McClusky (1934), who asked students to correct the
items which they had marked false by writing additional facts on the
test form. This procedure makes the administration and scoring of
this test much longer than is true of a true-false test, but it may give
a clearer picture of a student's knowledge of the subject. To a
limited extent it is thought to avoid the retention of false
information from false items.

Other tests combine the multiple-choice and completion forms. 
Curtis (1928) gave his students four choices for each item and also 
a blank in which a still more appropriate answer might be written. 
This combination increases the time needed for administering and 
scoring the tests by a considerable amount, depending upon the par- 
ticular questions and students. It has the advantage of making the 
students think harder to give a good answer than does the
multiple-choice form. It is also of use to those who wish to revise
multiple-choice test items, for students often write somewhat plausible but
wrong answers which, if included as given choices, would make the 
test item more discriminating than before. This source of alternate 
choices is important since it is usually difficult to find three wrong 
choices which are sufficiently like the right choice to make the dis- 
crimination as difficult as it needs to be, and which are not so close to 
the right answer as to be considered synonymous by some authori- 
ties. 

Another interesting combination is made by directing examinees 
to read a statement, and then to answer short questions about it. The 
questions may be true-false or multiple-choice items, such as: 

Place a plus sign (+) in front of each item that supports the first statement,
and a zero (0) in front of each item which does not.

RELATIVE TIME ALLOWANCES 

Ratings similar to those in Illus. 23 on time per item have been
determined somewhat more precisely for special cases by the study
of the periods actually needed by various persons. When various types
of items are so constructed as to have nearly the same content, as
reported by Ruch and Stoddard (1927), then the relative time per
item can be determined with considerable accuracy. Completion
items and items which had seven choices were both answered at the
rate of about 4 per minute; 5- and 3-choice items at the rate of 5 per
minute, and 2-choice items at 6 per minute. These figures apply to
particular historical information tests given in high school.
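
The rates just quoted can be turned into a rough planning aid. The sketch below is only an illustration: the function name and the dictionary are hypothetical, and the rates are the ones reported above for one set of high school history tests, so they should not be assumed to hold for other material.

```python
# Items answered per minute, as quoted in the text for particular
# high school history tests; other subject matter will give other rates.
ITEMS_PER_MINUTE = {
    "completion": 4,
    "7-choice": 4,
    "5-choice": 5,
    "3-choice": 5,
    "2-choice": 6,
}


def items_in_period(item_type: str, minutes: float) -> int:
    """Estimate how many items of one type fit into a work period."""
    return int(ITEMS_PER_MINUTE[item_type] * minutes)


if __name__ == "__main__":
    # Compare what a 30-minute period would hold for each type.
    for item_type in ITEMS_PER_MINUTE:
        print(item_type, items_in_period(item_type, 30))
```

On these figures a half-hour period holds half again as many 2-choice items as completion items.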

The writer has found that, in general, easy information items
show small differences in time between completion and true-false
forms. Difficult information items are answered much more quickly
in the true-false than in the completion form, sometimes five times
as rapidly. When complexity of reasoning is an important factor in
success, then the time needed per item increases greatly. When the
various forms of items are not constructed so as to be nearly
equivalent in content, it is not possible to predict any typical time
requirements.

ILLUS. 24. KNOWLEDGE AND SKILL REQUIREMENTS: CLERICAL CLASSES

The fact that 2-choice or true-false items are answered more rapidly 
than the other forms of items makes it possible to use a greater num- 
ber of 2-choice items in a given period. Thus the true-false tests may 
make up for their chance errors of measurement by including a larger 
number of items (Illus. 25, columns 1 and 8).
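
One way to see why more items can offset chance errors is to treat blind guessing as a series of independent trials. The sketch below rests on that binomial assumption, which is introduced here for illustration and is not part of the studies cited; the function name is hypothetical.

```python
import math


def guessing_sd_fraction(n_items: int, n_choices: int = 2) -> float:
    """Standard deviation of the proportion of items guessed right,
    assuming pure, independent guessing (a binomial model)."""
    p = 1.0 / n_choices                      # chance of guessing one item right
    return math.sqrt(p * (1.0 - p) / n_items)


if __name__ == "__main__":
    for n in (10, 50, 100, 200):
        print(n, round(guessing_sd_fraction(n), 3))
```

Under this assumption the chance variation in the proportion right shrinks with the square root of the number of items, so lengthening a true-false test narrows the part of the score that luck can account for.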

DETERMINATION OF SPECIFIC GOALS 

Before starting to construct test items the first task is to set specific 
goals for your examination. If information learned in a particular
course is to be measured, then the syllabus of the course should be 
followed in detail. If knowledge required for an occupation is to be 
tested, then a master list of important facts is needed. For determin- 
ing the knowledge requirements of a job, Guilford and Lacey (1947) 
and others have pointed out that job analyses are important. When 
making a job analysis the basic measurable primary traits should be 
kept in mind as well as the specific knowledge required. A job
analysis should also show the most difficult elements of the job, and 
from a study of successful and unsuccessful workers discover the 
trait differences among them. 

Civil service examiners usually construct tables in great detail to
help them prepare thorough examinations, as shown in Illus. 24.
Here the topic, amount, and level of difficulty of each requirement
are indicated for various clerical classes. When a large file of items,
classified by topic and difficulty for a known group, is available, a
test can be quickly compiled. Many civil service testing units now
have more than fifty thousand items on hand. Over a period of a
few years an individual teacher or a group can develop and list on
separate cards several hundred tested items. Each item should show
the dates when it was used, the degree of difficulty, and the validities
for each group.
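
A present-day equivalent of such a card file might be sketched as follows. The record layout, the field names, and the use of proportion passing as the difficulty scale are assumptions made for illustration; the text specifies only that each card carries the dates of use, the difficulty, and the validities for each group.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ItemCard:
    """One file card for a tried-out item (field names are hypothetical)."""
    text: str
    topic: str
    difficulty: float  # assumed scale: proportion of a known group passing
    dates_used: list = field(default_factory=list)   # dates the item was used
    validities: dict = field(default_factory=dict)   # group name -> validity


def compile_test(cards, topic, min_difficulty, max_difficulty):
    """Pull the cards on one topic whose difficulty falls in the given range."""
    return [c for c in cards
            if c.topic == topic
            and min_difficulty <= c.difficulty <= max_difficulty]


if __name__ == "__main__":
    # Hypothetical cards, for illustration only.
    cards = [
        ItemCard("Alphabetize ten names", "filing", 0.80,
                 [date(1949, 5, 2)], {"junior clerks": 0.35}),
        ItemCard("File by subject code", "filing", 0.40),
    ]
    for card in compile_test(cards, "filing", 0.5, 0.9):
        print(card.text)
```

Keeping difficulty and validity on the card itself is what allows a test of a stated level to be compiled quickly from the file.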

RULES FOR TEST CONSTRUCTION 

The following rules apply to the construction of almost all test 
items, but they are particularly applicable to objective-type tests.
First general rules are given, then more specific ones. 

General Rules 

1. Use positive statements. This rule may be broken occasionally,
but negative statements often lead to confusion, particularly in
true-false statements. Thus the item,

T F An IQ of 70 is usually not exceeded by the lowest 1 per cent
of a large random population

is true, but it is less likely to be misread if it is revised thus:

T F An IQ of 70 is usually exceeded by the highest 99 per cent
of a large random population

2. Avoid the unqualified use of words which have two or more 
meanings. This rule is hard to follow because one's own mental 
set usually gives only one interpretation to a statement at a time. 
Thus the essay item, 

Describe the principal factors in mental-test success 

permits at least two interpretations. Some persons interpret it to
mean describe the main factors which allow an individual to succeed
on a battery of tests. To others the question seems to be concerned
with the factors which make a test successful on the market or in a
particular survey. Therefore the item should be revised to establish
the correct interpretation.

3. Avoid using unnecessarily long or rare words except when they
are to be defined. Otherwise you may have a test of vocabulary rather 
than one testing what you are trying to measure. 

4. Incorporate only one independent idea in a question. This rule
is particularly important in short-answer items. Thus the item,

T F A correlation coefficient may show the amount of coincidence
of ranks on two tests and the reliability of the tests,

has a first clause which is true, and a second clause which is false or
incomplete as it stands. It should be made into two items, one
including the first clause, and the other somewhat as follows:

T F The reliability of a test may be shown by a correlation
coefficient based on two trials of the test on the same group of persons
given on the same day.

5. Avoid broad generalizations of time or place. Violations of this
rule are fairly common and very serious because the better informed
students will usually know of more exceptions to a generalization
than the other students. Thus the item,

T F The best test for selecting women machine operators was
found by Miss Hayes to be a simple measure of dexterity on a peg
board,

was supposed to be true according to one interpretation of the article,
but according to other interpretations it was false. It should be
revised to be less inclusive, thus:

T F The best test in point of administrative costs for selecting
women machine operators was found by Hayes to be a simple
measure of dexterity on a peg board.

Another example of too broad a generalization is:

T F The usual correction for chance success or failure in
true-false tests is justified when students are told not to guess.

This item is theoretically true, but there are many situations in which
it would not be true practically. It should be revised to cover a more
specific situation, such as:

T F The usual correction for chance success or failure on a
50-item true-false test is desirable when some students have omitted
a considerable proportion of the items.

6. Avoid telegraphic brevity which leaves one in doubt as to the
meaning. Thus the completion item,

The meanings of words depend upon ______,

is too brief for a simple interpretation, that is, one that can be made
by a single phrase. Also, the fourth grade history item,

Lincoln's policy toward the slaves was one of ______,

has too many possible answers.

7. Read the test item aloud. By doing this one can often discover
how a statement can be made more simple.

Specific Rules 

Multiple-Choice Items. Each multiple-choice item consists of a
statement which is followed by the several choices. The statement,
which may be a long paragraph or a short phrase, should be devised
according to the rules given above, and be devoid of unnecessary
padding or irrelevant material. It should not be a test of ability to
read complex material unless that is the purpose of the test. Even
when a good statement has been found it may be far from being a
satisfactory item because its alternatives may be hard to find and
arrange. The following suggestions are useful for the construction
of multiple-choice items:

a. Make use of completion-test responses. One of the best ways of
securing good alternate but wrong responses for a multiple-choice
test, a way which has been used many times with good results, is to
try out the same items first as completion tests. Some of the wrong
answers are likely to be more diagnostic than alternates which a group
of examiners or experts might think up.

b. The position of the right answer should not always be the same.
However, numbered answers should be placed in numerical order so
that they can be found easily.

c. Avoid suggesting the right or wrong answer, thus:

1) Choices which are not grammatically in line with the begin- 
ning of the item are usually the result of patch work, and 
hence are wrong answers. 

2) An unusually long, detailed choice is often the right answer. 

3) Choices which contain always or never are likely to be false. 
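
Suggestion b can be sketched in a few lines for anyone assembling items mechanically. This is a minimal illustration, not a prescribed procedure; the function name and the rule for recognizing numerical choices are assumptions.

```python
import random


def arrange_choices(correct, distractors, rng=None):
    """Order the choices for one multiple-choice item.

    Numerical choices are put in numerical order so they can be found
    easily; all other choices are shuffled so that the right answer
    does not always fall in the same position.
    """
    rng = rng or random.Random()
    choices = [correct, *distractors]
    try:
        # Succeeds only if every choice can be read as a number.
        return sorted(choices, key=float)
    except (TypeError, ValueError):
        rng.shuffle(choices)
        return choices


if __name__ == "__main__":
    print(arrange_choices(13, [12, 14, 15, 17]))
    print(arrange_choices("Dickens", ["Poe", "Byron", "Kipling"]))
```

With numerical choices the position of the right answer is fixed by its value, so variety of position has to come from the choice of wrong answers rather than from shuffling.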

Matching Items. With questions accompanied by two lists of 
items to be matched, one list should contain one or two more items 
than the other, so that the process of eliminating known answers will 
not lead to correct choices of unknowns. 

DETECTING AMBIGUITIES 

Two methods of procedure have proved effective in detecting
ambiguous items: (a) observation of behavior and (b) statistical
analyses. Observational methods are of great value in both performance
and verbal-testing evaluation. Ambiguity of scores can be detected in
a performance test by watching several persons work. Thus, in
assembly or manipulation tests, the observation may show that one
person interpreted the task quite differently from another, and solved
it differently. Scores in such cases represent different skills and are
therefore not comparable. The most valuable results from such
situations are often not numerical scores, but descriptions of how
different persons went about the work. Thus one person may assemble a
lock slowly and without unnecessary movements and achieve the same time
score as another person who moves fast and tries out a number of
more or less random combinations before hitting upon the right one.
The time score alone cannot be given an unambiguous interpretation.
For the most adequate measurement of skills it is necessary to
arrange the test situation or the directions so that both persons will
use nearly the same processes.

In verbal tests ambiguities of interpretation may often be directly 
observed either by taking the test and recording your own puzzling 
experiences, or by asking others to tell you how they interpret each 
item. Subjective analyses of this sort are practical. In the evaluation 
of a local objective test, one should, if possible, allow a period of a 
week or more between its construction and its inspection, in which 
partly to forget the original mental set or bias which determined the 
form and the selection of certain test items, and to take another point 
of view. Sometimes, after such an evaluation, the results are startling. 
After a subjective analysis has been made and the test has been applied
to a group, a statistical analysis should be made, as described in
Chapter III.


CHANCE SUCCESS 


Theoretical Considerations 


In tests that require the examinee to choose one of two or more
answers that are presented to him, there is an opportunity for
success without knowledge. Thus, in desperation many a student faced
by an unknown item has taken out his lucky penny and given his
answer according to its performance.

If all students answer all items in true-false or multiple-choice 
tests, no statistical correction for chance success will change their 
relative rank order. For this reason and because guesses are usually 
more often right than wrong, some examiners ask all persons to guess 
when they are not sure, but if some persons are cautious and omit 
the doubtful items, then corrections for chance successes usually
change the rank order of the examinees. In many standardized tests
persons are urged not to guess. Whatever directions are used, how- 
ever, it has usually been found that some persons will guess to a con- 
siderable extent and some will not. 

When there are but two choices, as in true-false items, and there
are equal numbers of true and of false items, then half of the items
would be guessed right on a chance basis alone. If we assume that a
person's wrong items are doubtful items at which he guesses, and that
he got as many doubtful ones wrong as right, his chance successes can
be removed from his score by subtracting the number wrong from the
number right ($R - W$).

When three choices are presented for each item, the mean chance
success on a number of items is one in three, and the corrected score
is the number right minus one half the number wrong, $R - \frac{W}{2}$;
when there are four choices, the corrected score is $R - \frac{W}{3}$;
and for five choices, $R - \frac{W}{4}$; and for $n$ choices,
$R - \frac{W}{n - 1}$.
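
The general correction can be written as a short function. This is a minimal sketch of the formula as stated above, with a hypothetical function name; omitted items are assumed to count neither as right nor as wrong.

```python
def corrected_score(right: int, wrong: int, n_choices: int) -> float:
    """Score corrected for chance success: R - W / (n - 1).

    `right` and `wrong` count answered items only; omitted items are
    assumed not to enter the formula.
    """
    if n_choices < 2:
        raise ValueError("an item needs at least two choices")
    return right - wrong / (n_choices - 1)


if __name__ == "__main__":
    # True-false test: 70 right, 30 wrong.
    print(corrected_score(70, 30, 2))
    # Five-choice test: 60 right, 20 wrong, 20 omitted.
    print(corrected_score(60, 20, 5))
```

The divisor grows with the number of choices, so the same number of wrong answers costs less on a five-choice test than on a true-false test.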

These corrections are probably not so effective as is often supposed,
for the assumption that all wrong answers are pure guesses is
doubtless unwarranted in a large proportion of the cases. Moreover, in
multiple-choice items the discriminations between alternatives are
seldom of equal difficulty, although equal difficulty is assumed in
applying the formula for correction. Furthermore, the person who is
lucky exceeds the mean chance success and gets credit for some of his
guesses even when the correction is applied, and the person who is
unlucky gets less credit than he should. This latter case is probably
rare, because a person's hunches, and sometimes even his sheer guesses,
are usually a little better than chance. Often one has some information
which can be used in selecting an answer, but which is not judged
to be complete.

Practically, one may decide whether or not to correct for possible
chance successes and failures from empirical evidence and the
intended use of the results. Let us see what happened when actual test
results were corrected for chance effects.

The Results of Correcting for Chance

A very good evaluation of the results of correcting for chance is seen
in the work of Ruch and Stoddard (1925), who compared the following
five types of tests in American history:

1 Recall: The American Revolution began in the year ______
2 Five choices: The American Revolution began in 1762 1775 1785 1789 1812
3 Three choices: The American Revolution began in 1762 1775 1789
4 Two choices: The American Revolution began in 1762 1775
5 True-False: The American Revolution began in 1773. T F

Two forms of each type were constructed, each containing one
hundred items. These were then given to large groups of high school
students. The recall type was always given first, followed by one of
the other types. Some of the results are shown in Illus. 25. The first
column indicates that on the average more than twice as many items
were chosen correctly as were recalled. When the theoretical
correction for chance success was made (column 2) the true-false type
showed approximately the same mean score as the recall type, but
the other tests had considerably higher means. Columns 3, 4, and 5
contrast the theoretical chance success with the actual chance
success. Column 5 shows that the 5-choice type had the greatest excess
of actual over theoretical success. The excesses were probably due
to a small but important additional context furnished by each choice
that was added.

The reliabilities, as shown by correlating two equivalent forms,
are given in columns 6, 7, and 8. The scores which had been
corrected for chance have nearly the same reliabilities as the
uncorrected scores. Column 8 indicates that all the forms would have had
nearly the same self-correlations if they had been constructed to fill
the same period of time. Another important finding, not shown in the
illustration, is that all of the choice tests showed larger standard
deviations when corrected for chance. This increase is due to the fact
that students with lower uncorrected scores lost more by the correction
than those with higher scores.
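
The widening of the spread can be checked on a small made-up case. The three scores below are hypothetical and are not drawn from the Ruch and Stoddard data; the sketch assumes a 100-item true-false test on which every item was answered, so that the number wrong is 100 minus the number right.

```python
from statistics import pstdev

N_ITEMS = 100                      # hypothetical test length
raw_scores = [90, 75, 60]          # hypothetical numbers right, all items answered

# True-false correction, R - W, with W = N_ITEMS - R.
corrected = [r - (N_ITEMS - r) for r in raw_scores]
losses = [r - c for r, c in zip(raw_scores, corrected)]

if __name__ == "__main__":
    print("corrected:", corrected)
    print("points lost:", losses)
    print("SD raw:", round(pstdev(raw_scores), 2))
    print("SD corrected:", round(pstdev(corrected), 2))
```

In this case the lowest scorer loses four times as many points as the highest, and the standard deviation of the corrected scores is twice that of the raw scores.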

The Results of Telling Students Not to Guess 

In a similar investigation Ruch and Stoddard (1927) directed half
of a large group of students to guess, when in doubt, and others not
to guess. The results are shown in part in Illus. 26 for 2,463 pupils in
the seventh, eighth, eleventh, and twelfth grades. The raw-score
correlations of any two equivalent forms were slightly higher among
students who were told not to guess than among students who were
told to guess. Self-reliabilities were thus improved (compare columns
1 and 4), and corrections for chance successes were then of little
value (compare columns 4 and 5). This result indicates a slight
advantage in telling students not to guess. The illustration also shows
that some students did not guess when asked to do so, for if all
students had answered every item, the correction for guessing would
not have changed anyone's position in the group, and the correlations
for corrected and uncorrected scores would have been the same in
columns 1 and 2.
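
The claim that the correction leaves rank order untouched when every item is answered follows directly from the correction formula. In the short derivation below, $N$ (the total number of items) and $S$ (the corrected score) are symbols introduced here for illustration; $R$, $W$, and $n$ are used as in the formula for chance success.

```latex
% Assume every examinee answers all N items, so W = N - R.
\begin{align*}
S &= R - \frac{W}{n - 1}
   = R - \frac{N - R}{n - 1} \\
  &= \frac{(n - 1)R - N + R}{n - 1}
   = \frac{nR - N}{n - 1}.
\end{align*}
% Since n/(n - 1) > 0 and N is the same for everyone, S rises and falls
% with R, so the order of examinees by S is the same as their order by R.
```

Only when examinees omit different numbers of items does $W$ cease to be fixed by $R$, and only then can the correction change anyone's position.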

The correlations of the scores on choice tests with the recall-test 
scores were nearly the same for both groups of students. The cor- 
rected scores corresponded little better than the uncorrected to re- 
call-test scores. If one assumes that the recall type of test is the most 
valid form of measurement, then he must conclude that directions 
to guess or not to guess had little effect upon the validity of the 
choice-types of tests.

Similar results were secured by Toops (1921) using Trade Tests on
college students, and by Andrew and Bird (1938) using questions in
psychology courses. These studies lead to the conclusions that:

1. The five types of test items measure different information and
skills only to a small extent.

2. All types have high and similar reliabilities when work periods
are the same.

3. Theoretical corrections for chance reduced total scores,
increased standard deviations, and usually did not change reliability
or validity.

4. Directions not to guess reduced mean raw scores from 10 to 20
per cent, reduced the mean corrected scores between 2 and 3 per
cent, and increased the reliability and validity slightly.

5. The true-false and 2-choice tests showed more chance results
than the other tests, and hence larger corrections for chance.

Theoretical corrections for chance are not applied widely on
multiple-choice items, but are usually applied to true-false or 2-choice
items, because:

1. Corrections for chance reduce total scores, and so may give a
more accurate picture of one's actual ability.

2. Corrections usually increase individual differences in raw scores.

3. The validity of a test, it is believed, may be improved by
corrections for chance. (On this, however, no clear evidence is available.)

4. Corrections for chance will be a protection against criticism.

Avoidance of Chance Results 

The discussion of chance errors and successes leads to the conclusion
that in many forms of tests they may be small enough to be
ignored, but that there are no sure ways of entirely correcting for
chance errors in individual scores. The three ways of avoiding chance
successes through methods used in construction of items and tests
are:

a. The test may include a large number of items. This allows the 
law of averages to reduce the chance errors in the score. If one hun- 
dred or more true-false items are given, the errors in scores due to 
chance are usually relatively smaller than when only ten are given.
A person's luck is more likely to change during a long test than dur- 
ing a short one. 

b. The items may have four or more choices; or completion items,
where the possible wrong items are numerous, may be used.² A
multiple-choice test item with four choices allows on the average only one
third as many to be guessed right as guessed wrong. In tests composed
of such items the correction for chance is usually a small proportion
of a total score.

c. Persons may be asked to indicate whether they are sure or doubt- 
ful of their answers, and then more credit may be given for sure 
answers. The proportions of sure answers which are correct are also 
indices of caution in the test situation. This index of caution has 
not been widely studied, but it seems to offer an interesting indica- 
tion of a personality trait. West (1923), Trow (1923), and Greene 
(1929, 1938) found tendencies for persons to vary in cautiousness ac- 
cording to the type of material, their familiarity with it, and their 
temperaments and training in exactness. In general the more abstract
material showed lower percentages of sure items correct than did the
concrete material, and persons with high scores had much higher
percentages of sure items correct than did the poorly informed. There
was a marked tendency for the less able students to mark nearly as
many items “sure” as did the more able students.

² A completion item may, however, involve a choice between two easily recalled
alternatives, such as:

Plato was born ______ Aristotle.

The Amazon River is ______ in length than the Nile.
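Points a and b above can be illustrated numerically. The sketch below is only an illustration under a simple binomial model of pure guessing, which is an assumption made here and not a model stated in the text; the function names are hypothetical.

```python
import math


def chance_sd_of_proportion(n_items, p_guess):
    """SD of the proportion right when every item is a pure guess (binomial model)."""
    return math.sqrt(p_guess * (1 - p_guess) / n_items)


def right_to_wrong_guess_ratio(choices):
    """Expected guessed-right items per guessed-wrong item with equally likely choices."""
    return 1 / (choices - 1)


# Point a: ten true-false items versus one hundred.
short_test = chance_sd_of_proportion(10, 0.5)
long_test = chance_sd_of_proportion(100, 0.5)

# Point b: four choices give one right guess for every three wrong guesses.
four_choice_ratio = right_to_wrong_guess_ratio(4)

print(round(short_test, 3), round(long_test, 3), round(four_choice_ratio, 3))
```

Under this model the chance variation in the proportion right shrinks with the square root of the number of items, so a test ten times as long has roughly a third of the relative chance error.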

STUDY GUIDE QUESTIONS

1. Why should teachers and employment examiners be alert to prepare
needed items?

2. What are the relative advantages of essay-type and multiple-choice-type
items?

3. How can the merits of various types of items be determined?

4. Even though chance successes are more common in the true-false type,
why are true-false tests and 5-choice tests usually of about equal reliability
when given the same time allowances?

5. What are the advantages of having a file of tested items? What facts
should be given for each item?

6. How may the specific goals of a test of information be established?
Of a test of problem solving? Of a test of attitudes? Of a test for job
requirements?

7. Prepare five multiple-choice items on the contents of this chapter,
following the rules for item construction.

8. Review the material on item difficulty and validity in Chapter III.

9. To what extent did Ruch and Stoddard find that varying the type
of item changed the reliability of the tests?

10. To what extent did directions to guess or not to guess affect the
validity of the test?

11. When all persons answer every item, what statistical corrections for
chance success are desirable on a true-false test?

12. What can be done to reduce or avoid chance answers?



CHAPTER V

TESTS OF EARLY CHILDHOOD




This chapter reviews some of the standard scales which have been
designed to measure growth in bodily control, social relations,
perception, language use, and problem-solving ability. Also, reading
readiness tests are described, and several examples of individual and
group predictions are shown.

INTRODUCTION 

Interest in early stages of human development has a fairly long
history which has been well reviewed by Baldwin and Stecher (1924).
Among the more important early contributions to this subject are
The School of Infancy by Comenius (1628), and Rousseau’s Émile
(1762), a discussion of individual freedom. These were followed by
Basedow (1770), Herbart (1898), and Froebel (1826), who described
and applied liberal methods of education, and led the way for a
number of treatises on infants, biographies of babies, and schools for
early training. Four developments have given new impetus to studies
of young children in the United States:

1. The psychoanalytic movement, which has stressed the importance
of early emotional maladjustments, as discussed by Anna
Freud (1925), Kanner (1935), and Thom (1922).

2. The establishment of day nurseries, partly from philanthropic
motives.

3. The theories of a group called behaviorists, namely Krasnogorski
(1907), Watson (1913), and Weiss (1925), who emphasized
direct observation of behavior.

4. John Dewey’s emphasis since 1899 on the experimental and
social aspects of early education.

These developments have led to the gradual growth of experimental
training centers. One was started at the University of Chicago
in 1906, and others soon after at the University of Iowa, the
Merrill-Palmer School in Detroit, Yale University, the University of
Minnesota, and the University of California. Now a fairly large number
of colleges and universities have well-planned centers for the study
of early childhood.

The three main problems for research which these centers have 
attacked are much the same as those dealt with in studies of older 
persons, namely:

1. What is the mean curve of development for a particular func- 
tion and what deviations are found in a normal group? 

2. Which sequences of development are usually found, which, only 
occasionally? 

3. To what extent are developments in various activities independent
of each other? To what extent may they be explained as due
to general factors? Do some inhibit others?

To answer such questions requires accurate observations over long 
periods of time and the collection of significant data on growth of 
the same individuals. To render more accurate observations, a number
of tests have been designed.

Binet included fourteen tests for children from three to six years
of age in his 1905 Scale. Goddard (1910) and Terman (1916) translated
these and added other tests for the same age groups. Kuhlmann
(1922) developed a scale which included five items each at the 3-, 6-,
12-, 18-, and 24-month levels, and eight tests each at the 3-, 4-, and
5-year levels. During the last quarter of a century a score or more
tentative scales have been published which include observations of
motor development, perceptual activities, and complex verbal
adjustments. This book will not attempt to review the voluminous
literature on prenatal and neonatal development. Tests for infants
generally include age groups of three to eighteen months, and
tests of preschool children those of eighteen to sixty months. There
is no sharp line between these two, but standardized tests for
preschool children are more numerous.


DESCRIPTION OF SCALES 

There is a marked trend toward analytical procedures in the
construction of preschool tests, but as yet conclusive analyses of factors
are lacking. A list of important preschool scales is given in Appendix
II; seven of these are described in this section.

Gesell and Amatruda (1947) Developmental Diagnosis

The work of Gesell and his associates at the Yale Institute of Human
Relations, which has been carried on since 1923, has been the
most detailed and extensive in the field of infant and child development.
Their 1947 book is a revision of several earlier volumes. From
1927 to 1931, 107 infants were given monthly examinations and
elaborate appraisals of their social and physical environments were
made. The infants were selected from homes of middle socio-economic
status. Their parents were of North European extraction. Birth
history, gestation period, and physical status were held within
specified limits to give the group further homogeneity, which was
considered desirable in that it results in less variability of norms for a
group.

The materials used included a standard crib with a movable platform,
a cup, spoon, saucer, set of cubes, rattle, bell, ring on a string,
crayon and paper, pictures, a box, and a ball. These articles were of
minute specifications of size and shape, color, weight, and texture to
insure standard testing conditions.

The reactions of the 107 infants to standard presentations of these
articles were recorded on printed sheets, and also on twelve thousand
feet of film, and in a large atlas. Printed norms show the percentages
of each age group that responded in a particular manner to a particular
situation. There were, for example, 84 items listed under behavior
in response to the bell, 125 items refer to responses to one or
more cubes, and 48 to the presentation with a cup. Illustrations 27
and 28 give the norms for cup behavior.

Three kinds of items appear in this illustration: increasing, decreasing,
and focal items. The increasing items are those which show
larger percentages with increasing age, as in 1, 6, 7, and 16. The
decreasing items show smaller percentages with age, as in 2 and 8. The
focal items at first show larger percentages with increasing age, but
later show smaller percentages, as in 3, 25, and 30. From tables such
as these, the “critical age” for each item was determined. For increasing
items the critical age is the first age at which an item is passed by
50 per cent of the group; for focal items the critical age is the age of
the maximum percentage.
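The two rules for the critical age can be written out as a short sketch. The function names are hypothetical, and the percentages are those given in Illus. 27 for items 7 and 25 as read here, with the blank 12-week cell of item 25 entered as None; that column placement is an assumption about the printed table.

```python
AGES_WEEKS = [12, 16, 20, 24, 28, 32, 36]


def critical_age_increasing(percentages, ages=AGES_WEEKS, threshold=50):
    """First age at which the item is passed by at least `threshold` per cent."""
    for age, pct in zip(ages, percentages):
        if pct is not None and pct >= threshold:
            return age
    return None  # the item never reaches the threshold in the ages tested


def critical_age_focal(percentages, ages=AGES_WEEKS):
    """Age of the maximum percentage (earliest such age if the maximum repeats)."""
    observed = [(pct, age) for age, pct in zip(ages, percentages) if pct is not None]
    best = max(pct for pct, _ in observed)
    return next(age for pct, age in observed if pct == best)


regards_consistently = [0, 5, 38, 66, 100, 100, 100]   # item 7, increasing
dislodges_on_contact = [None, 9, 53, 50, 38, 29, 8]    # item 25, focal

print(critical_age_increasing(regards_consistently))
print(critical_age_focal(dislodges_on_contact))
```

On these figures item 7 would be placed at 24 weeks and item 25 at 20 weeks. The text gives no rule for decreasing items, so none is sketched.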

The examination of an infant results in a record of observed responses.
Items were placed in each schedule according to their critical
ages. For convenience a schedule is given for each of the following
ILLUS. 27. CUP BEHAVIOR
12 weeks to 36 weeks

Cp. Behavior Items                                12   16   20   24   28   32   36
 1. Regards immediately                           81   89   97  100  100  100  100
 2. Regards momentarily                           37   22    9    3    -    -    -
 3. Regards recurrently                           44   56   28   21    4   18   15
 4. Regards prolongedly (n.m.p.)*                 68   69
 5. Regards prolongedly                           37   57   47    3    -    -    -
 6. Regards predominantly                         73   93   94  100  100  100  100
 7. Regards consistently                           0    5   38   66  100  100  100
 8. Shifts regard                                 80   73   33   23   10   25   23
 9. Shifts regard to surroundings                 40   33    6    3    -    4   15
10. Shifts regard to hand                         47   45   13    7    -    -    -
11. Shifts regard from cup to hand                20   28   15    3
12. Arm increases activity (s.p. or n.m.p.)*      75   79   78   93  100  100  100
13. Brings hand to mouth (s.p. or n.m.p.)         50   21
14. Hands active on table top (s.p. or n.m.p.)    36   67
15. Approaches (n.m.p.)                           44   79   72
16. Approaches                                     6   25   91  100  100  100
17. Approaches promptly (n.m.p.)                  25   55
18. Approaches promptly                            6   25   44   81   96  100  100
19. Approaches after delay (n.m.p.)               13   23
20. Approaches with both hands                     6   11   34   41   69   50   50
21. Approaches handle first                        0    5   25   38   56   64   58
22. Contacts (n.m.p.)                             44   67
23. Contacts                                       6   15   69   91  100  100  100
24. Dislodges on contact (n.m.p.)                 25   52
25. Dislodges on contact                                9   53   50   38   29    8
26. Grasps                                                  13   52   85  100  100
27. Grasps with both hands (n.m.p. or s.p.)             5   22   53   52   36   42
28. Grasps with both hands                                   3   28   33   32   31
29. Grasps with one hand                                     9   35   52   75   69
30. Manipulates with hands encircling cup                    6   24   56   50   42
31. Manipulates grasping by rim                              0    0   45   64   62
32. Manipulates grasping by handle                           6   31   59   92   81
33. Pushes or hits                                     14   31   30   41    3    4
34. Pushes or drags cup                                     19   24   33   50   31
35. Bangs on table top                                  -    3    6   37   36   58
36. Turns cup over on table top                              6   26   14   17    8
37. Lifts cup                                                6   45   82  100  100
38. Lifts by handle                                          6   35   59   79   81
39. Brings to mouth                                          3   24   63   60   66
40. Manipulates above table top                              0   21   67   86   89
41. Manipulates initially above table top               -    3    3   26   18   35
42. Holds with both hands                                    3   35   63   46   46
43. Transfers                                                          19   43   42
44. Turns cup right side up                                  0    3   56   71   62
45. Rotates                                                  3         3   21   31
46. Drops                                                    6   38   63   61   42
47. Drops and resecures                                      -    7   15   39   19
48. Fusses                                         6    7   12   27    7   18   11

* n.m.p. = near median position; s.p. = standard position
(By permission of Gesell and Thompson (1938, p. 109) and the Macmillan Co.)
ILLUS. 28. GROWTH OF CUP BEHAVIOR

[Graph. Vertical axis: per cent. Curves shown: 7 Regards consistently;
9 Shifts regard to surroundings; 23 Approaches with both hands;
27 Grasps with both hands.]

(Drawn from Gesell and Thompson, 1938)

key ages: 4, 16, 28, and 40 weeks and 12, 18, 24, and 36 months. Each
schedule also includes typical behavior for ages a few weeks or months
above and below the key age. Illustration 29 shows the schedule for
40 weeks. Each schedule includes items designed to reveal four
areas of development: motor, adaptive, language, and personal-social.
The text contains more than a hundred line drawings, as well as
descriptive statements to help the examiner decide accurately the
level of performance which he has observed. Thus according to Gesell
and Amatruda (1947) the typical behavior at 40 weeks is:

The forty-week infant sits with good postural control and without
support before the test table.

He gives immediate heed to the first cube and seizes it with a radial
digital grasp. He transfers the cube and retains it as the second cube is
presented. He seizes this in a similar manner and holds the 2 cubes as the
THIRD cube is presented. He approaches the third cube with a cube in
hand, hitting or pushing the cube on the table, and he brings 2 cubes into
apposition as though matching them.

In the MASSED cube situation he reaches for the screen, but then
immediately approaches the mass with one hand and grasps a single cube,
selecting the top cube or a corner cube. Holding one cube he grasps another,
and brings the cubes into combination. He releases a cube and exploits 3 or
more in all with method and control.

The examiner now places the cup at the left side of the cluster of cubes.
The baby grasps the cup by the rim; later he takes a cube and brings it
against the outside of the cup. The examiner then drops a cube into the
cup, and the baby reaches in and fingers the cube in the cup.

He approaches the pellet with extended index finger and prehends it
promptly with an inferior pincer grasp.

Securing the baby’s attention to the maneuver, the examiner drops the
pellet into the bottle and places the bottle on the test table. The baby
watches the dropping of the pellet, but his regard for the pellet in the bottle
is questionable. He grasps the bottle and mouths it. If the pellet falls out
he regards it on the test table but continues to exploit the bottle.

The examiner then presents the pellet beside the bottle, the pellet on
the right. The baby reaches for the pellet first, grasps the pellet, drops it
and then exploits the bottle.

He approaches the bell and seizes it by the handle. He mouths the bell,
transfers it, and spontaneously waves and shakes it.

The RING WITH STRING in oblique alignment is placed on the test table.
He reaches directly toward the ring first, then plucks the string easily, pulls
in the ring, transfers the ring and manipulates the string.

The FORMBOARD is placed with the round hole at the baby’s right, and the
baby is offered the round block. He pulls at the formboard (the examiner
holds it firm), accepts the round block, transfers and releases it. The
examiner inserts the block in the round hole, and the baby pulls and pries
at it and removes it with considerable difficulty. He again transfers and
releases the block.

The test table is now removed and the baby is offered the ball. He mouths
the ball and releases it but cannot be induced to respond to the examiner’s
demonstrations and invitations to roll or toss the ball back and forth in
cooperative ball play.

He is then confronted by a mirror. He regards his image, leans forward
and smiles and vocalizes as he pats the glass. He is offered the ball which
he accepts and retains; he disregards the mirrored ball.

POSTURAL BEHAVIOR is now observed. He has already displayed his ability
to sit with good control. Enticed by a lure, he goes from the sitting to the
prone position. In prone he gets up on his hands and knees and creeps
forward. Holding a railing he pulls himself to his feet, stands holding on,
and lowers himself again. When his hands are held, he stands supporting
his full weight.

He VOCALIZES mama and dada, and has one other “word.” He imitates
sounds (cough, click, razz) and responds to “no-no” and his name.

He is REPORTED to hold his bottle and to feed himself a cracker. He
pat-a-cakes and waves bye-bye.¹

The examiner enters a plus sign whenever the child demonstrates 
the behavior pattern, or the mother reliably reports its presence, or 
when more mature similar patterns have been displayed. He enters 
a minus sign whenever a child’s behavior fails to demonstrate a 
pattern. The child’s maturity level or age in any field is at the point 
where the “aggregate of plus signs changes to an aggregate of minus 
signs.” If the plus and minus signs are both found over a wide range, 
the range is indicated by giving the high and low limits, for example, 
“adaptive behavior: twenty to thirty-two weeks.” Four developmental
quotients (DQ’s) may be secured by dividing the maturity age by the 
chronological age in each area, and a general developmental quotient 
may be found by averaging the four, if they are close together. 
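A minimal sketch of the quotient arithmetic follows. It assumes that the ratio is multiplied by 100, as the reported values such as "DQ 45-50" suggest; the text does not say how close together the four quotients must be before they are averaged, so the 10-point limit below is purely a hypothetical choice, as are the function names and the example ages.

```python
def developmental_quotient(maturity_age, chronological_age):
    """DQ as maturity age over chronological age, times 100 (same units for both)."""
    return 100 * maturity_age / chronological_age


def general_dq(quotients, max_spread=10):
    """Mean of the four area DQs, or None when they are not close together.

    `max_spread` is an assumed limit; the source gives no figure.
    """
    if max(quotients) - min(quotients) > max_spread:
        return None
    return sum(quotients) / len(quotients)


# Hypothetical infant examined at 40 weeks; maturity ages in weeks for the
# motor, adaptive, language, and personal-social fields.
maturity_ages = [40, 38, 36, 38]
area_dqs = [developmental_quotient(age, 40) for age in maturity_ages]

print(area_dqs)
print(general_dq(area_dqs))
print(developmental_quotient(21, 42))  # 21-month level at 3 1/2 years
```

When the four quotients differ widely the sketch returns no general quotient, which mirrors the text's condition that the average is taken only "if they are close together."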

Gesell has long stressed the need for appraising the whole personality.
More than two hundred pages are given to detailed
descriptions of children who show amentias, endocrine disorders,
convulsive disorders, abnormal neuro-motor signs, cerebral injury,
blindness, deafness, prematurity, precocity, and environmental
retardation. Growth-trend charts, showing both temporary patterns and the
patterns which normally replace them, are given in detail along with
an excellent glossary. The examiner is urged to take into account all
factors which influence behavior: illness, fatigue, apprehension,
insecurity, personality deviations, language usage in the home, etc.
The appraisal must be penetrating and must sum up “personal
characteristics, integrity of organization, and latent and realized
possibilities.” Two typical reports of examinations taken from Gesell and
Amatruda (Developmental Diagnosis, 1947, pp. 151, 313) are given
here:

1. [He] had always been a very “good baby,” content to be left alone for
hours. Development had always been slow. He was first examined at 3½
years of age. Left to his own devices he wandered about the room aimlessly
climbing, screaming, whistling, and fingering objects idly. He did respond in 
some measure to loud, stern, insistent commands and could thus be induced 
to conform to the requirements of the examination. His general maturity 
level was approximately 12 to 21 months, DQ 45-50. He built a tower of 
cubes, dumped the pellet from the bottle, turned the pages of a book and 
placed all the forms in the formboard; he had no words. 

¹ Arnold Gesell and Catherine S. Amatruda, Developmental Diagnosis: Normal
and Abnormal Child Development, Harper & Bros.
At 5½ years, after two years in a special school, he is controlled, obedient
and “trustworthy,” having stabilized his activities and advanced in social
adaptability. Developmentally, however, he has made essentially no
progress. DQ 25-30.

The decelerating developmental trend has reduced this boy from a
high-grade imbecile level to a high-grade idiot level. In a suitable environment
and with skillful training much of his disturbing erratic behavior has
disappeared. He is another example of simple primary deficiency.

2. [She] was taken off the maternity ward for adoption. She was known to
be the offspring of well-educated parents. This fact has added support to
the high opinion which we soon gained of her developmental potentialities.
At 8 weeks she gave no evidence of advanced status; but at 20 weeks her
performance proved to be definitely above the average. Her drive was strong,
tense, almost excited; her manipulation so active that it resulted in a
two-stage transfer of the ring. At 40 weeks her adaptive behavior almost attained
a one-year level. She made a determined effort to place the pellet into the
bottle. She took huge delight in the whole examination and displayed a
mature kind of amiability in her cooperativeness. At 2 years there was the
same excellent rapport with the examiner. Her responses were immediate,
decisive and of excellent quality. Her performance was above the 30-months
level in the motor and language fields. In a nursery-school group at 2^4
years she is credited with superior postural control, versatility in
jungle-gym play activity, adeptness in solving mechanical problems, a delightful
sense of humor, and a capacity to protect her own status without
aggressiveness in the social group. She is an attractive child with indubitably superior
growth potentialities.

The validity of the testing schedule is thought to be high by its 
authors because of the way the schedule was developed. The method 
of selecting the infants, the allocation of the items to four fields of 
behavior, and the determination of a critical age for each item, on 
the basis of the percentage of infants who demonstrated the item — 
all these were arrived at through study. Neither systematic errors 
in applying or summarizing the scale, nor random errors in scoring 
have as yet been reported. The training of the examiner is very im- 
portant in securing comparable results. 

Along with these measures of behavior Gesell and Thompson 
(1938) listed the following fourteen direct measurements which are 
included in nearly all careful anthropometric studies:

1. Length from soles of feet to vertex (total length)

2. Length from soles of feet to suprasternal notch

3. Length from soles of feet to pubes

4. Biacromial diameter

5. Thorax diameter

6. Bicristal diameter

7. Head circumference

8. Thorax circumference

9. Weight

10. Number of erupted teeth

11. Head-neck length

12. Body length

13. Eye color

14. Hair color

From these direct measurements fifteen indices were found by
dividing various measures by other measures. These specifications for
measurement, described by Dawson (1936), were agreed upon at the
14th International Congress of Prehistoric Anthropology and
Archaeology. Measures of infants were taken in the recumbent position,
using a measuring board. Older children and adults were measured
in the erect position.

Gesell’s Development Schedules, 18 to 60 Months

To measure older children, Gesell and others (1940) issued a detailed
test manual and described their theoretical approach in their
book, The First Five Years of Life (Harper & Bros.). This book
describes normal behavior under four headings which may be roughly
outlined as follows:

1. Motor Development: the organization of movements, upright posture,
walking and running, prehension and manipulation, and laterality and
directionality.

2. Adaptive Behavior: block building, form adaptation, form discrimination,
drawing, number concepts, immediate memory, comparative judgments,
and problem solving.

3. Language Development: (1) developmental stages: jargon, vocabularies,
phrases, sentences, understanding, articulation; (2) behavior situations:
picture books, use of language, parts of body, naming objects,
following directions, picture cards, analysis of pictures, action-agent,
comprehension, "What must you do? When?", prepositions, humor.

4. Personal-Social: eating, sleeping, elimination, dressing, communication,
play activities, aesthetic behavior, and developmental detachment.

For examination purposes printed test schedules are available for
the following months: 15, 18, 21, 24, 30, 36, 42, 48, 54, 60, and 72.

A complete examination usually requires several sessions on various
days in addition to interviews with parents. The results are given
in descriptive maturity-level ratings in each field, together with
reports on marked deviations from these levels and a summary of the
child's reaction to the whole test. Further characterizations are given
which go beyond the psychometric findings. Thus, for adaptive
behavior, evidences of unusual inquisitiveness, originality, or
decisiveness are recorded, but no attempt is made to rate their magnitude.
These characterizations, which are typical of the results of projective
techniques (Chapter XXIII), are supported by specific manifestations
which may in time yield quantitative results. Among the fifteen traits
to be observed (p. 307) are:

1. Energy output: general amount and intensity of activity.

2. Motor demeanor: posture, general muscular control and poise, motor
coordination and facility of motor adjustment.

3. Self-dependence: self-reliance and self-sufficiency.

4. Social responsiveness: reaction to other persons and to the attitudes of
adults and of other children.

5. Family attachment: closeness of affection, degree of identification
with the family group.

6. Communicativeness: expressive reference to others by means of gesture
and vocalization.

7. Adaptivity: capacity to adjust to new situations.

8. Exploitation of environment: utilization and elaboration of environment
and circumstances in order to gain new experience.

9. Sense of humor: sensitiveness and playful reactiveness to surprise,
novelty, and incongruity in social situations.

10. Emotional adjustment: balance and stability of emotional response in
provocative situations.

11. Emotional expressiveness: liveliness and subtlety of expressive
behavior in emotional situations.

12. Reaction to success: expression of satisfaction in successful endeavor.

13. Reaction to restriction: expressiveness of behavior in reaction to
failure, discomfort, disappointment, frustration.

14. Readiness of smiling: facility and frequency of smiling.

15. Readiness of crying: promptness and facility of frowning and tears.

A sample of part of a record from Gesell’s The First Five Years of
Life (1940, p. 307) is given below:

As early as the ages of 8 and 12 weeks the highly dynamic personality of
Boy D made a strong impression even when observed only through the
medium of the cinema. The following adjectives were used to characterize
his individuality: quick, active, happy, friendly, well-adjusted, vigorous,
forceful, alert, inquisitive. Although he was definitely extrovertive he showed
at the early age of 24 weeks a surprising discriminativeness in reading the
facial expressions of his mother. By the age of 28 weeks he had developed a
moderate temper technique for influencing domestic situations which did
not altogether please him. He was able to shift quickly in his emotional
response from smiling to crying and from crying to smiling to achieve a
desired end. At the age of 5 years, likewise, his emotional reactions are labile
and versatile. He is facile in changing his emotional responses. He is highly
perceptive of emotional expressions in others, and correspondingly, highly
adaptive in social situations. With this emotional alertness, he shows a
relatively vigorous detachment from his mother as well as affection for her. He
is not given to persisting moods. We do not get the impression that his
emotional characteristics have been primarily determined by his life
experiences. The underlying nature of his “emotivity” at 12 weeks, at 52
weeks, and at 260 weeks seems rather constant. With altered outward
configurations a certain characteristicness in emotional reactions is quite likely
to persist in his later life.

Another important book by Gesell and his colleagues, Biographies
of Child Development (1939), gives follow-through reports on thirty-one
children for a period of more than 10 years and growth studies of
fifty-one other children for shorter periods. The difficulties of ap- 
praising the same traits over various periods are shown, but certain 
marked consistencies are noted. 

Minnesota Preschool Tests, 1½ to 6 Years (Goodenough, 1932)

This test is divided into verbal and nonverbal parts which may be 
given separately or together. To make it possible to retest the child 
without having the results of the second test affected by specific 
memories of tasks of the first test, two forms have been provided.

Both the verbal and the nonverbal series are given to the subject
with verbal instructions, but the comprehension of instructions is
easy in those tests that have been classed as nonverbal. Since the
author used few nonverbal tests for children under three years, the
distinction between verbal and nonverbal tests is not considered to be
useful at ages below three years.

A preliminary selection of tests to be used was made from a survey
of previous work and from original materials. This preliminary series
was applied to one hundred children in each of nine half-year
chronological periods. Their parents represented a cross-section of
occupational levels. An analysis of the children’s responses was the basis for
the final selection of items. Items without intrinsic interest for the
children, or causing a pronounced emotional reaction, or otherwise
inappropriate were eliminated. The twenty-six subtests given in
Illus. 30 were retained and arranged in a somewhat random order for
a standard presentation. It was thought that this order sustained
interest by variety of task and by having easy items mixed with harder
items. The items were assigned various points of credit according to
their difficulty. In scoring a test the points of credit are totaled, and
then changed to MA’s or standard scores. The test items are arranged
for convenience in administration in a large book of envelopes
with instructions and materials for each test placed together.

ILLUS. 30. MINNESOTA PRESCHOOL TEST
Form A

Maximum scores (Verbal, Nonverbal) are shown in brackets where given.

 1. Part of body: ears, chin [2]
 2. Objects in pictures: chair, apple, house, flower
 3. Naming objects: ball, watch, pencil, scissors
 4. Copying drawings: circle, triangle, diamond
 5. Imitative drawing: horiz. stroke, vertical, cross [2]
 6. Block building: three-cube pyramid, six-cube pyramid [2]
 7. Response to pictures: a. nouns, prep. or verbs; b. nouns, prep. or verbs
 8. Cube imitation: 1234, 12342, 1324, 1423, 14324
 9. Command: drink
10. Comprehension: hungry, sleepy, house on fire
11. Discrimination of forms: number correct, number wrong [10]
12. Naming from memory: doll, pencil, penny, horse, shoe, fork [6]
13. Recognition of forms: a, b, c
14. Colors: red, blue, pink, white, brown
15. Tracing forms: circle, square, irregular forms
16. Picture puzzles: horse, goat, apple, camel
17. Incomplete pictures: bird, girl, watch
18. Digit span: 2 digits, 3 digits, 4 digits [3]
19. Picture puzzles: bird, flower, giraffe
20. Paper folding
21. Absurdities: a, b, c, d, e
22. Mutilated pictures: foot, finger
23. Vocabulary: a, b, c, d, e, f, g
24. Opposites: a, b, c, d, e, f
25. Clock: 8:10, 1:50, 12:00, 1:10
26. Speech during examination

Scores added by ____     Total Verbal 80
Checked by ____          Total Nonverbal 55

(Arranged from Goodenough, 1932. By permission of the Educational Test
Bureau, Minneapolis, Minn.)

Correlations are high between scores achieved on an equivalent
form of the test given within a period ranging from one to seven days
when computations are made for various chronological age groups of
6-month intervals. The correlations range from .68 to .94 for the
verbal series, with a mean of .86. For the nonverbal series the
correlations range from .67 to .92, with a mean of .89. This mean indicates
that prediction from scores on one test for scores on a second test
within a short interval will be fairly accurate. Correction for practice
effect is made by subtracting 2 C-score points from the second test,
if the second test is given within the week. Goodenough (1932) states
that the tests are about equally reliable at all ages within the scope
of the test, from eighteen months up to six years.
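The practice-effect correction described above can be written as a small sketch. It assumes that "within the week" means an interval of seven days or fewer; the function name and its parameters are illustrative and are not taken from Goodenough's manual.

```python
def corrected_retest_c_score(second_c_score, days_between_tests):
    """Apply the practice-effect correction described in the text.

    Assumption: "within the week" is read as an interval of 7 days or fewer.
    """
    practice_correction = 2  # C-score points, as stated in the text
    if days_between_tests <= 7:
        return second_c_score - practice_correction
    return second_c_score


# Hypothetical retest scores, for illustration only.
print(corrected_retest_c_score(54, 3))   # retest after 3 days: 52
print(corrected_retest_c_score(54, 30))  # retest after 30 days: 54
```

The correction applies only to the second test; the first score is left as recorded.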

Goodenough Measurement of Intelligence by Drawing, 3½ to 13½ Years

Goodenough (1926) concluded, from her own observations and
from a thorough study of others' research, that children's drawings
could be used as an indication of intellectual development. She
devised a test in which children draw a man from memory. The
examiner gives the following directions to a group of children:

On these papers I want you to make a picture of a man. Make the very
best picture that you can. Take your time and work very carefully. I want to
see whether the boys and girls in ____ School can do as well as those in other
schools. Try very hard and see what good pictures you can make.

Judicious praise in general terms is advised as, for example, "These
drawings are fine; you boys and girls are doing very well." Suggestions
by the children are not allowed. The test usually takes no more
than 10 minutes.

Goodenough chose a man as the standard subject for the drawings 
because it is one with which all children are familiar, it has universal 
interest and appeal, and a man's clothing is more uniform than that
of a woman or a child. 

The children on whom the test was standardized were selected at 
random except for age-grade classification. They were within the 
normal range for in-grade-at-age.

The points used for scoring were chosen because they showed:

1. A regular increase in the percentage of children succeeding at
successive ages.

2. A rapid increase in this percentage.

3. A clear differentiation between the performances of children
who were of the same age, but of different school grades.

Point credits were given for the presence of a line representing part
of a man and also for correct proportion and perspective. The points
are all described in detail and illustrated by specimen drawings with
scores attached, as in Illus. 31. It is possible to secure scores up to
51 points, which can be turned into mental age scores. Two points
is the score for the average child of three and one half years; 6 points
is the score for a child of four and one half years, and 10 points is the
score for a child of five and one half years, on up to a score of 42 for a
child of thirteen and one half years. Material for more complex
psychological analysis of the products is given in the book, but the
test is chiefly used as a tentative classification of intelligence.

ILLUS. 31. DRAWING OF A MAN
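The four norms quoted above (2, 6, 10, and 42 points) all fall on one straight line of 4 points per year of mental age, so a rough conversion can be sketched as follows. The linear rule is an interpolation assumed here for illustration only; Goodenough's published table, not this formula, is the authority, and scores outside the range of the quoted norms are rejected.

```python
def approx_mental_age_years(points):
    """Rough mental age (years) from a Goodenough drawing score.

    Assumption: linear interpolation through the norms quoted in the text
    (2 points = 3.5 years, 6 = 4.5, 10 = 5.5, 42 = 13.5), i.e. 4 points per year.
    """
    if not 2 <= points <= 42:
        raise ValueError("score outside the range of the quoted norms")
    return 3.0 + points / 4.0


# Reproduce the four norms quoted in the text.
for pts in (2, 6, 10, 42):
    print(pts, approx_mental_age_years(pts))
```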

The use of a single sample of drawing is defended by Goodenough, who
found a correlation of .937 ± .006 between two tests on successive days
for 194 children in the first grade. In another instance, by computing
the score by the split-half method, and by using the Spearman-Brown
formula, a mean reliability of .77 was found for ages from five to ten
taken separately. "The probable error of estimate of a true IQ earned
on the drawing test is approximately 5.4 points at all ages from five
to ten years." Other investigators have reported that drawings are
very much influenced by a child's emotional conflicts and that the
Goodenough scores are often affected by emotional blocking.
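The Spearman-Brown step mentioned above estimates the reliability of a full-length test from the correlation between its two halves. The sketch below shows the standard formula for a test of doubled length and its inverse. The half-test correlation for the drawing test is not reported in the text, so the value printed here is only a back-calculation from the quoted .77, and the function names are illustrative.

```python
def spearman_brown(r_half):
    """Estimate full-length reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)


def half_correlation_for(r_full):
    """Inverse of the formula: the half-test correlation that steps up to r_full."""
    return r_full / (2 - r_full)


# Back-calculated illustration: which half-test correlation yields .77?
r_half = half_correlation_for(0.77)
print(round(r_half, 3))                  # 0.626
print(round(spearman_brown(r_half), 2))  # 0.77
```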

Cattell Infant Intelligence Scale 

Psyche Cattell (1940) issued an Infant Intelligence Scale which 
was composed of items similar to those of the Gesell Development 
Schedules, the Merrill-Palmer Preschool Test, the Minnesota Pre- 
school Test, and Charlotte Buhler’s First-Year-of-Life Scales. Items 
from 1,346 examinations of 274 children were analyzed and placed 
in a scale on the following bases: 

1. Items for which successful responses showed significant increases 
over several 3-month periods and for which the responses finally 
approached 100-per-cent success. 

2. Items which were easy to administer and score, and required
little subjective judgment. Cumbersome apparatus was avoided.

3. Items which were interesting to most children at the ages where
the items were used.

4. Items which tested mental abilities rather than socially
developed skills or control of the large muscles.

5. Items which tested similar abilities at other ages.

(Illus. 31: Goodenough, 1928. By permission of the World Book Co.)

The final arrangement included five regular and two alternate tests 
at each of nineteen age levels ranging from two to thirty months. 

These tests were finally so placed as to yield Development Quo- 
tients similar to those secured by the same children in the Stanford- 
Binet Test Form L, at three or three and one-half or four years of 
age. At no age did the median IQ differ from the Stanford-Binet IQ
by more than two points. 

The median changes between two successive IQ's were much
larger (about 11 points) before twelve months of age than after, when
they were about 7 points. The test at three months of age had a
split-half reliability of .56 with the Spearman-Brown estimate. It
correlated only .10 with the Stanford-Binet, Form L, given at
thirty-six months. The predictive value of high ratings at three months
was considerably better than that of low ratings. In other words,
poor scores could result from many factors, but high scores actually
represented high ability.

The tests at six, nine, and twelve months of age all had reliabilities 
of about .88, and at thirty-six months their correlations with the III-6 
Stanford-Binet scores were .34, .18, and .56 respectively. Reliabilities
and correlations with the Stanford-Binet were higher (.70 to .83) for 
ages above eighteen months. 

The variations in IQ are thought to be due to individual changes 
in growth curves or tempo of development rather than to inadequacy 
of the tests. The causes of some of these changes seem to be innate 
and some seem to stem from serious illness. A great deal of work is
needed to determine the probable rate-of-growth trends in individuals
and in homogeneous groups.

The scoring is in terms of months of growth and, since each age 
level contains five items, the credit in terms of time for each item 
passed is one fifth of the period covered. The tests at each level cover
the preceding period; for instance, the test in the fourth month
covers the period between the third and fourth months.
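The scoring rule above amounts to simple arithmetic, sketched below. The sketch assumes that credit is added to the highest age level at which all items were passed (called the basal age here, a term not used in the text) and that the length of the period covered by each level is supplied by the examiner; the record shown is hypothetical.

```python
def cattell_mental_age_months(basal_age_months, levels_above_basal):
    """Sum month credits: each of the five items at a level is worth one
    fifth of the period (in months) that the level covers.

    levels_above_basal: list of (period_months, items_passed) pairs.
    Assumption: credit is added to a basal age at which all items were passed.
    """
    items_per_level = 5
    credit = 0.0
    for period_months, items_passed in levels_above_basal:
        if not 0 <= items_passed <= items_per_level:
            raise ValueError("items_passed must be between 0 and 5")
        credit += period_months * items_passed / items_per_level
    return basal_age_months + credit


# Hypothetical record: all items passed through the 3-month level, then
# 3 of 5 items at the 4-month level and 1 of 5 at the 5-month level,
# each of these levels covering a 1-month period.
print(round(cattell_mental_age_months(3, [(1, 3), (1, 1)]), 1))  # 3.8
```

Passing the period length in as a parameter avoids building any assumption about the spacing of the nineteen age levels into the code.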

Cattell emphasizes that great care must be used in interpreting 
results. If an important decision is to be made, a second examination
should be given after several weeks, and a third examination 6
months later. This procedure is especially important when illness or
lack of cooperation is evident. She believes that parents should seldom 
be told what the IQ of the child is. 

The California Preschool Schedule 

A scale known as the California Preschool Schedule, Macfarlane
(1938), has been compiled of tests from Stutsman, Kuhlmann, Binet,
Terman, and local authors. It is important because it is probably the
most analytical of the preschool scales and because it has been used
in a thorough growth study. It covers ages from 15 to 84 months and
has two equivalent forms. The record sheets classify the items into
ten groups, as follows:

1. Motor skills: pegs, buttons, pins, bow knot

2. Block building: tower, door, design, stair

3. Drawings: scribbling, vertical and horizontal lines, circle, triangle,
diamond

4. Discrimination of form: cards, form boards

5. Discrimination of spatial relation: cover on box, in, or under

6. Discrimination of number and size: large and small blocks, count
objects to 10, ½ of 6, and 8

7. Language comprehension: points to picture, action agent, opposites

8. Language facility: expressiveness, use of language, preposition,
pronouns, past, plurals

9. Memory: span, finds object, memory for 9 objects

10. Completion, nonverbal: mends doll, watch, pictures

This classification is descriptive of the materials and symbols rather 
than of the mental processes used. The scale probably does not have
enough items in each class to make a reliable profile, but it does allow
one to compare 3 types of motor skills, 3 types of discrimination,
2 of language use, and 2 of memory functions. Such comparisons will 
lead to a knowledge of the interrelation of these skills and eventually 
to more analytical scales. 

Primary Mental Abilities Test, Ages 5 and 6 

The most thorough statistical approach to measurements in early 
childhood is probably that of Thelma Gwinn Thurstone and L. L. 
Thurstone, who defined intelligence as a composite of abilities for 
acquiring knowledge of various types. After extensive studies of 
intellectual abilities over a period of 20 years, the authors found eight 
components of intellectual ability which they proved to be stable 
and independent characteristics of a person. After tests for adults 
had been developed, they turned their attention to five-year-olds. 
Seventy tests were constructed including all the known types of pre- 
school and readiness tests, and certain new varieties. Factor analyses 
were carried out for groups of two hundred children in kindergarten 
and first grade, and five primary factors were found which could be
readily identified. The purest measures of each factor were
incorporated into the Primary Mental Abilities Test for Ages 5 and 6.
All the tests require a child to cross out or underline or draw lines
in a booklet in response to oral directions by the examiner. In order
to help children keep their attention centered at the right place on
the page, a white cardboard marker is placed by the child just under
the item being considered. Approximately seven practice and
thirty-five test problems are given for each primary ability. The five
tests are described as follows:²

1. Verbal Meaning. The ability to understand words is tested in
five ways:

a. Vocabulary. In a row of four pictures the child is asked to
cross out the one named.



Put your marker under the first row of pictures. 
Mark the fruit. 


b. Sentence Comprehension. In a row of four pictures the 
youngster marks one which is the answer to a question. 



Slide your marker down. Which one is used to 
wake soldiers in camp? Mark it.


c. Sentence Completion. The child is requested to mark one 
of four pictures to show which fits into the blank in a sentence. 





Put your marker under the first row of pictures. 
Mark the picture that finishes this story: If 
you want to reach a book on a shelf and you 
have no ladder you may use a ____. Mark it.


² These items are taken from the examiner’s manual giving directions and
correct answers, by permission of T. G. Thurstone, L. L. Thurstone, and The
Science Research Associates. In the test booklet each picture is approximately
one inch square.

d. Paragraph Comprehension. The central meaning of a short
statement must be illustrated.



Put your marker under the first row of pictures. 

Mark the picture that goes with this story: 

After he had washed his face and eaten his 
breakfast Jack carried his book to school.

Mark it. 

e. Auditory Discrimination. Two words that sound nearly 
alike are to be distinguished. 



Slide your marker 
down, ^ar and pear. 
Mark PEAR. 


2. Perceptual Speed. The ability to locate details quickly is
measured by:

a. Pictures

In every row of pictures you are to do two
things. First mark the picture all by itself
in the little box. Then find the picture in
the big box which is exactly like the picture in
the little box and mark it too. Work fast. Do
as many as you can on these two pages before I
tell you to stop. Are you ready? BEGIN.


(Allow exactly ONE AND ONE-HALF MINUTES from the
time you say "BEGIN.")

In every row of pictures you are to do two
things. First mark the picture all by itself
in the ring. Then find the picture which is
exactly like the picture in the ring and mark it
too. Work fast. Do as many as you can on these
two pages before I tell you to stop. Are you
ready? BEGIN.

(Allow exactly TWO MINUTES from the time you say
"BEGIN.")


3. Quantitative Ability. This is the ability required for counting.
It is measured by three subtests.

a. Counting

Put your marker under the first row of pictures
- the airplanes. Mark THREE airplanes. THREE.


b. Comprehension of quantitative concepts

Put your marker under the first row of pictures
- the fish. Mark the FIRST and the LAST fish.

c. Story problem

Put your marker under the first row of pictures
- the shovels. Billy and George want to dig in
the yard. How many shovels do they need? Mark
them.

4. Motor. This is a line-drawing test.

Put your marker under the first two rows of
dots in the box. They look like this. (Point
on blackboard.) Someone has made some of the 
lines for us. We’re going to finish the row. 
We draw lines from the top dots to the bottom 
dots like this. (Illustrate on blackboard.) 
Now you do it in your book. Draw lines from 
the top dots to the bottom dots. Finish the 
row. (Give individual help where needed.) 

Move your marker down to the next two rows of 
dots in the box. See how FAST you can do this 
row but be careful. Draw lines from the top 
dots to the bottom dots. Be sure to hit both 
dots. 



5. Space. There are two tests for this ability.

a. Squares. The child is requested to mark one of four
drawings which would fit into the drawing at the left.



Put your finger on the locket. Mark the first
picture in the row. That is PART of a square.
Find the REST of the square and mark it.

b. Copying. The child is to complete the picture on the right
so that it looks like the picture on the left.

Put your finger on the wrist-watch. Make the
children’s drawing look just like the teacher’s
drawing.

The test is administered during two half-hour periods according 
to instructions given in a 24-page booklet. Groups of from five to 
ten pupils can be tested at once if they are cooperative. Large tables
or desks close together are not satisfactory, because children are too
used to looking at each other's work. Only tests for perceptual speed
and motor factors are timed. The exact words to be used for each 
item are printed in the instructor's manual. The scores, which are 
the total numbers of items correctly marked, are recorded on a pro- 
file sheet which yields age scores ranging from three to seven years, 
in steps of two months for each subtest, and also for the total of four 
tests (the motor test is omitted in the total of mental tests). The 
ability-age scores may be divided by chronological-age scores to give 
quotients of development. For over-all success in the first grade the 
prognosis from these tests is: 

Over 6 yrs, 6 mos     definitely ready
6 years to 6-5        probably ready
5-6 to 5-11           probably not ready
5-0 to 5-5            definitely not ready

Similar predictions of success in learning to read are possible from 
the verbal and the perceptual tests and of success in arithmetic from 
the quantitative ability tests. 
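The quotient and the readiness table above can be sketched as two small functions. Several points are assumptions made here for illustration: the quotient is multiplied by 100 (the text states only the division), the prognosis is read from the total age score, a score of exactly 6-6 is treated as "definitely ready," and scores below 5-0 are reported as outside the table.

```python
def development_quotient(ability_age_months, chronological_age_months):
    """Ability age divided by chronological age.

    Assumption: the ratio is multiplied by 100, as is conventional for
    IQ-type quotients; the text states only the division.
    """
    return 100.0 * ability_age_months / chronological_age_months


def first_grade_prognosis(total_age_score_months):
    """Map a total ability-age score (in months) to the prognosis table.

    Assumptions: exactly 6-6 counts as "definitely ready" (the table says
    "over 6 yrs, 6 mos"), and scores below 5-0 are not covered.
    """
    if total_age_score_months >= 78:   # 6 yrs, 6 mos
        return "definitely ready"
    if total_age_score_months >= 72:   # 6-0 to 6-5
        return "probably ready"
    if total_age_score_months >= 66:   # 5-6 to 5-11
        return "probably not ready"
    if total_age_score_months >= 60:   # 5-0 to 5-5
        return "definitely not ready"
    return "not covered by the table"


# Hypothetical child: ability age 6-2 (74 months), chronological age 5-10.
print(round(development_quotient(74, 70)))  # 106
print(first_grade_prognosis(74))            # probably ready
```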

These primary-ability tests are very significant, because they yield 
scores which are unequivocal and which cover five important and 
relatively independent abilities. They give a research tool for im- 
portant studies of growth and the effects of training. They pilot the 
way toward more careful analytical approaches to evaluation at
lower and higher age levels.

Measures of Reading Readiness

A number of tests designed to indicate the development which
is needed by those who are starting to learn to read have appeared.
Such tests are usually called reading readiness tests. Observations
have shown that such indicators may be found:

1. By measuring the child’s ability in tasks which normally are
performed progressively better during the first year of school.

2. By securing his intelligence test score.

3. By securing teacher ratings of interest in reading.

4. By measuring language ability, motor coordination, hearing,
and visual discrimination.

5. By finding how well informed the child is about his own
environment.

6. By learning the type of emotional adjustment or interest which
he shows toward other children, adults, materials in the school
situation, and toward himself.

The greater number of the tests for reading readiness are not
comprehensive enough to take in all these factors. Usually the limitations
of space, time, and money reduce them to paper-and-pencil tests
sampling a few indicators. The Metropolitan Readiness Test, the
Monroe Reading Aptitude Tests, and the Betts Ready-to-Read Tests
are typical.

Metropolitan Readiness Test. Although the Metropolitan Readiness
Test has been designed as a group test, it is suggested that the
group be small, preferably with fewer than ten children. It may be
given individually if necessary, though, of course, the same
standardized directions are to be followed. Hildreth, Griffiths, and Orleans
(1933) described their test as follows:

The test consists of six parts. The first is a test of perception, involving
recognition of similarities. It consists of 23 items ranged in order of
difficulty and including both pictorial material and symbols (Illus. 32).

Test 2 is a second perception test, involving the copying of 11 figures.
This type of test has proved to be highly diagnostic of mental maturity in
young children, and several of the items are comparable with those
contained in the Binet series. The factor of reversals, which enters into a
number of the items, has been found to be correlated with lack of experience
and with immaturity of perceptual abilities in young children.

Tests 3 and 4 are designed as measures of vocabulary. They consist
respectively of 19 and 15 sets of rows of four pictures each. In Test 3 the
child is to select the picture that illustrates the word the examiner names.
This is a test of understanding or comprehension of language, not a test of
the child’s language usage. Test 4 is similar in organization, but requires the
child to comprehend phrases and sentences instead of individual words.
The extra conversation, which is really not absolutely necessary for the
location of the right picture by the child, is added to make the test one
of more sustained attention comparable to the attention span required to
listen to stories and the like, in the beginning work in reading.

Test 5 measures number knowledge. By means of 40 items it measures
achievement in number vocabulary, counting, ordinal numbers, recognition
of written numbers, writing numbers, interpreting number symbols, the
meaning of number terms, the meaning of fractional parts, recognition of
forms, telling time, the use of numbers in simple problems. By using the
picture form of material and certain checking devices, the varied aspects
of number knowledge can be very satisfactorily explored.

Test 6 appraises common knowledge by means of a multiple-choice
picture test of 16 items. The child is required to select from a row of four
pictures the one that satisfies the examiner’s description. The test is short,
but gives good evidence of variability within a group at first-grade entrance.


ILLUS. 32. SAMPLES FROM THE METROPOLITAN READINESS TEST

Test 1, Similarities; Test 2, Copying; Test 5, Numbers

(Hildreth, Griffiths, and Orleans, 1933. By permission of the World Book Co.)


Percentile norms for each half-year from five and one-half to eight
years are furnished for total scores. These show slightly larger
improvements on the part of the advanced rather than of the retarded
children during this period. The average child improves from a total
of 64 at five years, nine months, to 78 at seven years, nine months.

Monroe (1935) Reading Aptitude Tests. The Monroe Reading Aptitude
Tests include items similar to those in the Metropolitan Readiness Test
and also measures of immediate memory spans, articulation, auditory
discrimination, speed and accuracy of hand and eye coordination, and
lateral dominance. Seventeen tests are grouped into six sections, each
of which has a separate score. A profile of six ability patterns is
therefore available:

A. Visual Tests

1. Memory of orientation of forms. The child must observe a diagram
which the teacher exposes and then draw a line around the one of
two smaller pictures in his booklet which is like it. The two small
pictures are lateral or vertical opposites of each other. The test was
designed to detect perceptual reversals. Twelve pairs of pictures are
used.

2. Ocular-motor control and attention. This test requires one to
indicate the path from a picture of a boy to one of three small pictures
of houses. Nine diagrams of increasing complexity are used.

3. Memory of forms. This test asks a child to draw a set of four small
pictures immediately after they have been exposed for 10 seconds.
Four sets of pictures are used.

B. Motor Tests

1. Speed. The child is asked to place dots in ¼-inch printed circles for
60 seconds.

2. Steadiness. The task is to draw a pencil line on a row of dots and
dashes, without time limits.

3. Writing name. The child is asked to write his name with each hand.

C. Auditory Tests 

1. Word-discrimination. A small picture is presented and three words
are pronounced. Thus, for the picture of a small sailboat, the words
beet, boat, and boot are spoken. The task is to indicate which of
the three words is the name of the picture. In all, nine pictures are
presented.

2. Sound-blending. Three small pictures are presented at a time, and
the examiner pronounces the name of one of them, separating vowel
and consonant sounds. The task is to blend the sounds into a word
and to circle the picture which was named.

3. Auditory memory. A short story is read aloud by the examiner, and
the child is asked to "tell what the story was about." Twenty-two
ideas may be given separate credits.

D. Articulation Tests 

1. Reproduction. The child is asked to pronounce a series of
twenty-four words or phrases, ranging in difficulty from baby to
transcontinental.

2. Speed. For this test the directions are, "I want to see how quickly
you can talk. When I say go, say banana, banana, banana, banana,
as quickly as you can. Keep on saying it till I say stop. Ready, Go!"
Fifteen seconds are allowed. Similar trials are given, using long ago
and take a bite.

E. Language Tests 

1. Vocabulary. Twelve rows of three small pictures each are presented.
The task is to draw a circle around the object named by the examiner.

2. Classification. "Name all the animals you can think of as quickly as
you can." Thirty seconds are allowed. The same procedure is used
for things to eat and for toys.

3. Sentence length. A picture of a farmer and two boys is shown while
the examiner says, "Here is a pretty picture. What is the picture
about?" The score is the number of words in the longest sentence or
phrase used.

F. Laterality Tests 

Score equals number of right-side preferences.

1. Hand preference. Writing, throwing a ball, pretending to comb hair,
holding a bat (shoulder), threading a needle, winding thread on a
spool, folding hands (uppermost thumb), and folding arms
(outermost arm).

2. Eye preference. The child holds the large end of the paper cone to
his face and is asked to look at three objects about the room. The
eye used for sighting is noted. The eye used to peep through a small
hole in a piece of cardboard which the child holds before his face
is also noted.

3. Foot preference. Hopping, kicking an imaginary football, and
climbing upon a low chair.

Ten of the items may be administered to small groups, and seven 
must be given to individuals. Norms for 434 children from five and
one-half to eight and one-half years are given in smoothed percentile 
curves. The odd-even reliability of the entire test was found to be 
.87, but that of the separate sections was not given. The correlation 
between total Monroe scores and reading achievement one year 
later, as shown by a combined score on Gray's Oral Paragraphs and 
the Iota Word Test, was .76 among eighty-five first grade children. 
The separate sections of the Monroe test all correlated between .50 
and .60 with the combined score, with the exception of .18 for
right-side dominance. Monroe suggested remedial procedures for aiding
children with special difficulties. 
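The odd-even reliability mentioned above is obtained, in general, by scoring the odd-numbered and even-numbered items separately and correlating the two half scores. The sketch below illustrates that general procedure only. The text does not say how Monroe's figure was computed or whether it was stepped up by the Spearman-Brown formula, and the pass/fail records are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def odd_even_reliability(item_scores):
    """item_scores: one list of item scores per child, items in test order."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)


# Hypothetical pass/fail records for five children on six items.
records = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 0, 0],
]
print(round(odd_even_reliability(records), 2))  # 0.66
```

With so few children and items the coefficient is unstable; the example is meant to show the mechanics, not a realistic value.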

Betts (1936) Ready-to-Read Tests. All of the Betts Ready-to-Read
Tests are given individually and require more complicated apparatus 
than those mentioned above. They do not include special measures 
of comprehension but do include the following four types of meas- 
ures: 

1. Visual Perception

a. Discrimination of letter forms
b. Discrimination of word forms and vocabulary

2. Auditory and Articulation

a. Phonetic perception of letter groups
b. Auditory span for sentences
c. Auditory fusion of vowel and consonant sounds into words
d. Repetition of sounds
e. Acuity of hearing: number read in a low voice at 20 feet

3. Visual Mechanism: tested by a stereoscope

a. Acuity of each eye and of both
b. Binocular fusion and balance
c. Depth perception
d. Near-sightedness

4. Lateral Dominance: hand, foot, and eye

Measures similar to these have generally been found to show low 
correlations with reading success in the first grade, because motiva- 
tion, information, and reasoning ability may overcome mechanical 
and perceptual handicaps. Auditory and visual handicaps, however,
should be detected at an early age to avoid unnecessary strain. 

Summary of Descriptions of Scales 

These brief reviews are only enough to show that the items in- 
cluded in the various scales employ a wide variety of test situations. 
A clear understanding of test results comes only from a large number 
of direct observations of tests in actual progress and a thorough study 
of child development. 

Two methods of selecting items have been illustrated here. In one,
stimulus situations were secured which were thought to indicate a
factor of intelligence or to predict reading ability. In the other
(shown in Gesell's work), there was at first a wide random selection of
stimulus situations intended to give a fairly complete picture of
behavior. Later, items were grouped into classes which seemed to
show related forms of behavior. Selection by means of careful statis- 
tical analyses of relationship has been accomplished in the Primary 
Mental Abilities Test. The scaling of items in all these tests is achieved 
by converting raw scores into mental ages. 

SOME RESULTS OF MEASUREMENT 

Typical results of measurements of preschool children will be 
discussed under two headings: usual order of development, and 
correlational studies of growth. 

The Usual Order of Development 

A number of observations have been made to show the order of 
development of behavior patterns. In the sphere of motor coordina- 
tion the results are fairly uniform from person to person on a number 
of patterns. Thus Shirley (1933) reported a definite order of
emergence of motor patterns in locomotion, and Halverson (1931)
reported approximate ages at which a definite series of prehension
patterns developed. He described (p. 218) the stages as follows:

1) No contact includes all instances wherein infants for some reason fail
even to touch the cube.

2) Contact only includes instances in which infants succeed in touching
but not in securing the cube.

3) Primitive squeeze. The infant thrusts the hand beyond the cube and
corrals it by pulling it toward him on the table with thumb, wrist, or hand
until he succeeds in squeezing it against the other hand or the body. The
primitive squeeze is not a true grasp for the hand does not actually grip the
cube.

4) The first form of actual grasp is the squeeze grasp. The hand, palm-in,
approaches the cube laterally on the table to envelop it. At the moment of
contact the fingers close on the cube so as to press it strongly against the
heel of the palm with the thumb extended on the upper face of the cube.
This grasp is clumsy and usually results in failure to hold the cube; at least
no infant has succeeded in raising the cube from the table with this grasp.

5) In the hand grasp the infant brings the pronated hand down pawlike 
fully upon the cube, curls the fingers down on the far face of the cube with 
the thumb paralleling the digits on the adjacent surface, and presses the 
cube firmly against the heel of the palm. The fingers appear to be of equal 
importance in grasping and the thumb seems to lack tonicity. 

6) The palm grasp is accomplished by setting the pronated hand down 
on the cube so that the fingers curl over the top and far down on the further 
face with the thumb pointing down against the near face to oppose the 
fingers in forcing the cube against the palm. We have now for the first time 
active thumb-opposition. This new feature in the thumb repertoire of
functions and the simultaneous budding into prominence of the forefinger
are mainly responsible for the higher types of grasp which follow. Up to
this point all digits function in holding the cube in the middle of the palm.
From now on only the first three digits function prominently in grasping
so that we find the cube no longer in the middle of the palm but shifted to
the radial edge of the hand. A faulty palm grasp becomes a hand grasp. As
a matter of fact, the hand grasp is often due to the failure of the thumb to 
orient itself properly for opposition to the other digits. 

7) In the superior-palm grasp the infant sets only the radial side of the 
palm down on the cube with the thumb against the near side opposing the 
first two fingers, which are curled down on the far side. In closure the digits 
press the cube against the thumb and palm. 

8) The inferior-forefinger grasp greatly resembles the superior-palm grasp. 
There is the same thumb-forefinger opposition, but the digits at the end of 
the approach point more medianward than downward (tendencies of this 
change of pointing appear in the superior-palm grasp), an angle of ap- 
proach which makes thumb-opposition simple, and the cube is no longer 
pressed against the palm. This type of grasp represents an achievement of 
no small degree for the infant, for here is a clear demonstration of the fact 
that the digits are beginning to act independently of the palm in grasping 
and holding. Heretofore, the infant uses the palm in grasping to make up 
for the shortcomings in gripping by the digits. Now he finds that he can 
make the necessary fine adjustments of the proper digits to secure his grasp 
and can maintain the delicately balanced pressure of the digits against the 
near and far faces of the cube to insure its being held. We may say that the 
advance in grasping from the palm type to the forefinger type marks the 
change from a three-surface grasp to a two-surface grasp; a tri-directional 
pressure gives way to a di-directional pressure. The inferior-forefinger grasp 
is not a true fingertip grasp, however, for the cube is still well up toward 
the palm so that a considerable portion of the palmar surfaces of the digits 
contacts with the cube. 

9) The forefinger grasp is essentially a fingertip grasp. The cube is well 
out on the tips of the first three digits (sometimes four digits) with the thumb 
opposed to the fingers. Up to this point, all types of grasp require that the 
hand come to rest on the table before the cube can be raised. Thus the 
table serves as a base or leverage point for lifting the hand after it grasps 
the cube. In the forefinger type of grasp the digits are pretty well extended, 
distinct flexion appearing only at the metacarpophalangeal joints. In earlier 
forms of grasp the digits curl about the cube. 

10) The superior-finger grasp is similar to the forefinger grasp, except 
that the infant in grasping does not have to place any portion of his hand 
on the table top to aid the placement of the digits against the cube, nor does 
he require the table for leverage in raising the cube. The hand alights on 
the cube, attains its grip, and raises the cube deftly and neatly. True, the 
hand may touch or brush the table as the hand settles on the cube, but the 
presence of the table about the cube is not essential to grasping or raising it. 

Similar studies also show order of growth in language patterns. 
Within the first few hours after birth there is, according to Lewis 
(1936), a period during which the cries accompanying discomfort 
and the sounds accompanying comfortable situations can be dis- 
tinguished by careful observers. This is followed in the second or 
third months by babblings. Babblings are repetitions of comfort 
sounds which seem to be produced because they give pleasure to the 
one making them. Lewis believes that this is a rudimentary form of 
aesthetic experience which may profoundly affect the adult’s ap- 
preciations of music and poetry. During the first three or four months 
there also occur imitations of adult speech in which there is only a 
rough similarity between what the child hears and what he produces. 
According to Lewis, imitation of intonation is rare unless the child 
is experiencing the same emotional pattern as the adult. 

At about the fourth month a stage follows which lasts from 4 to 6 
months, during which time little imitation is reported by many care- 
ful observers. This marked decrease in imitation is thought to be due 
to the fact that during this period the child develops responses to the 
meaning of what he hears and that these new associations prevent 
responses to mere sound. There is also a development of variety in 
vocalization. This period is followed by one in which a large amount 
of definite imitation of both form and intonation occurs. For the first 
time echolalia and imitation are noted. Echolalia is a marked tend- 
ency for a child to imitate immediately what he hears in an appar- 
ently meaningless way. This is the kind of imitation that is character- 
istic of extremely feeble-minded children at later ages. It also gives a 
child important practice in speaking adult phonetic forms. Persistent 
echolalia may prove a hindrance to the acquisition of language as an 
instrument, but it is a necessary stage of development during which 
the child gives close attention to the perception and vocal reproduc- 
tion of words. 

The growth of language from this point on is a matter of rapid ac- 
cumulation of phonetic forms and new concepts. A child compre- 
hends more spoken words than he uses in his own speech, and this 
doubtless lasts throughout life. Also, a five- or six-year-old under- 
stands written words before writing them. These processes, however, 
come in fairly close succession, and are dependent upon the concepts 
which the individual develops. Precise concepts lead to the rapid and 
accurate use of words, and this accurate use of words leads in turn 
to the development of more complex concepts. The order of the 
development of concepts and modes of thought is not easy to ap- 
praise, since observations and test results usually record end-results 
and not the mental processes which produced them. 

Challenging and much challenged discussions of concept develop- 
ment are found in Piaget's three volumes (1926, 1929, 1930), in Stern 
(1914), in Koffka (1928), and in the reports of psychoanalytical in- 
vestigators, for example, Anna Freud (1925) and Kanner (1935). 
These investigators have not reported standardized testing methods, 
but they have made detailed observations which are of great value. 

The results of the most careful testing of both language and non- 
language patterns are still far from complete, but the order of dif- 
ficulty of items on standard tests is an indication of their most prob- 
able order of development in the individuals who were used to stand- 
ardize the test. From Illus. 27, Cup Behavior, one could calculate the 
probability that any item in the list would precede any other item. 
The items which reach a maximum earlier will probably precede the 
items which reach a maximum later. The order of development will 
almost certainly be: regarding, approaching, contacting, grasping, 
and manipulating. 

The development of social behavior also shows a fairly definite 
order in which simpler types of perception and memory precede the 
more complex coordinations and judgments. 

Correlational Studies of Growth 

Correlation analyses of measures of small children at various ages 
have not yet given very clear results because the samples used were 
too small, and the tests were usually subject to large accidental 
variations. A number of correlations have already been quoted along 
with descriptions of the scales. The following reports are all frag- 
mentary but they indicate certain typical relationships, and they 
point the way for future research. 

Bayley (1933) studied forty-nine infants from birth to three years, 
using items from scales by Kuhlmann (1922), Gesell (1925) and Jones 
(1926). Each infant was given at least six examinations. The result 
showed such low correlations between items and between trials that 
Bayley states, “The behavior growth of early months of infant de- 
velopment has little predictive relation to the later development of 
intelligence— even though the later behavior may depend in large 
part on the previously matured, elementary neural connections or 
behavior patterns.” Correlations between total scores on consecutive 
trials ranged from .71 to .89 with a mean at .82, but the correlation 
between age and test total during the first year was .98. These high 
correlations between consecutive trials were probably due to differ- 
ences in age level, rather than to similarities in mental organization. 
If the ages of the infants were held constant, the correlations between 
trials would become nearly zero. 
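
Holding age constant corresponds to what is usually called a first-order partial correlation. The Python sketch below shows only the arithmetic. The function name and all three input values are hypothetical, chosen for illustration, and are not Bayley's figures; such a calculation is meaningful only when the three correlations come from the same group of children.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial r)."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical inputs: two consecutive trials that correlate .82 with each
# other, and .90 and .91 respectively with the age of the infants.
r_trials = 0.82
r_trial1_age = 0.90
r_trial2_age = 0.91

residual = partial_correlation(r_trials, r_trial1_age, r_trial2_age)
print(f"Correlation between trials with age held constant: {residual:.3f}")
```

With these illustrative inputs nearly all of the agreement between trials is accounted for by age, so the partial correlation falls close to zero.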

Illustration 33 shows the changes in IQ reported by Bayley (1940) 
from successive tests of one child over a period of 9 years. Bayley 
summarized a study of forty-eight children from one month to nine 
years of age as follows: 

In an attempt to find some measure of mental ability that would rate 
children consistently during the first nine years, several combinations of 
scores were studied. 

1. A Developmental Score made up of the sum of the items of the sepa- 
rate mental and motor tests yielded no greater consistency than did the 
mental test alone. 

2. A selection of items from the California Preschool Mental Scale im- 
proved slightly the correlations of the two- and three-year tests with later 
performance, but the improvement was not great enough to produce con- 
stant scores over four or five years. 

3. Tests of vocabulary given at from six to nine years of age gave scores 
moderately related to language tests at three and three and a half years, 
and were not significantly related to the age of first talking or to early 
mental-test scores. 

4. Tests of formboard and puzzle-board performance (at five and a half 
to eight and a half years), although related to mental-test and vocabulary 
scores at similar ages, were unrelated to tests of ability during the first year. 
It was concluded that mental organization changes with growth, and that 
the rate of change is especially rapid before two years of age. 

Bayley also reported that correlations between the educational 
levels of the midparent (an average of both parents) and the scores of 
infants between one and fifteen months of age are slightly negative. 
With further growth, however, the relationship changed and at three 
years of age reached a positive correlation of .47. 

ILLUS. 33 GROWTH CURVE OF EARLY CHILDHOOD 

Note: This child varies from the average, from +1.20 to -1.80 
standard deviations, by the age of 30 months. Variations of this 
degree were fairly common among preschool children. 

(Bayley, 1940, p. 25. By permission of the National Society for the 
Study of Education.) 

Furfey and Muehlenbeim (1932) used the Linfert-Hierholzer Scale 
(1928) with groups of infants who were six, nine, and twelve months 
of age. Later eighty-one of these children were tested by the Stanford- 
Binet (1932) scale at the age of four years, three months. The correla- 
tion between the two scales was .00 ±.07. This zero correlation may 
mean that the two scales were measuring uncorrelated patterns. It 
may also be due to variations in individual growth or organization 
at various ages. Most of the tests for infants are heavily weighted 
with manipulative skills, whereas the Stanford-Binet requires much 
use of language at the age of four years. 
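
The figure following the ± sign can be checked with a short calculation. The Python sketch below assumes that it is a probable error of the correlation coefficient, a convention common in reports of that period; the text does not say which measure was used, and the function name is hypothetical.

```python
import math


def probable_error_of_r(r, n):
    """Probable error of a correlation r from n cases: 0.6745 (1 - r^2) / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


# Values taken from the text: r = .00 with eighty-one children.
pe = probable_error_of_r(0.00, 81)
print(f"Probable error: {pe:.3f}")
```

Under that assumption the result is 0.6745 divided by 9, or a little under .075, which agrees with the reported ±.07 to two places.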

Macfarlane (1938) concluded from a 4-year study of 244 preschool 
children that mental ability, as indicated on the California Pre- 
school Schedule, showed less predictability as the intervals between 
tests increased. The correlations between tests separated by 6-month 
intervals ranged from .68 to .72. The correlations between tests 
given at twenty-one and seventy-two months were approximately .30. 
Macfarlane also found very low correlations between the child's 
mental-test score and midparent’s education. These ranged from .07 
at twenty-one months, to .12 at thirty-six months, to .33 at seventy- 
two months. No simple explanation of these results is apparent. Since 
the test material included a wide variety of mental and motor skills, 
any simple interpretation is improbable. It may be that the content 
of the tests changed with age, or that the organization of traits in the 
child became more stable. 

A study of the intellectual status of 226 adults who had been tested 
on at least one Minnesota Preschool Test before six years of age 
was reported by Maurer (1946), who located these adults from the 
records of a group of 1,091, and succeeded in getting them to take 
a 21-minute revision of the Alpha Test. She determined the correla- 
tions of total scores and also the predictive value of separate items 
in various preschool tests. On the basis of these correlations seventeen 
out of thirty tests were considered significantly predictive. The 30 
tests were classified by Maurer ³ as follows: 

Predictive Tests                     Nonpredictive Tests 

Imitative drawing *                  Pointing out parts of the body 
Block building (2 parts)             Pointing out objects in pictures 
Response to pictures (nouns)         Naming familiar objects 
Knox cube imitation *                Copying drawings (borderline) * 
Discrimination of forms *            Response to pictures (verbs) 
Naming colors                        Obeying simple commands 
Tracing a form *                     Comprehension 
Picture puzzles (rectangular) *      Naming objects from memory 
Incomplete pictures                  Aesthetic comparisons 
Digit span                           Recognition of forms (borderline) * 
Picture puzzles (diagonal) *         Paper folding * 
Definitions                          Imitating position of clock hands 
Absurdities                          Speech 
Mutilated pictures * 
Vocabulary 
Comprehension of directions 
Giving word opposites 

* Nonverbal tests 

³ Lists of tests used with the permission of Katharine G. Maurer and the 
University of Minnesota Press. 

ILLUS. 34 GROWTH CURVE OF EARLY CHILDHOOD 

(Gesell, 1940, p. 152. By permission of the National 
Society for the Study of Education.) 


Maurer points out that in general the predictive items required 
adaptability to a novel situation, while the nonpredictive items were 
too easily affected by recent experience or were too dependent upon 
motor or language skills. Naming familiar objects and pointing to 
parts of the body were thought to be nonpredictive because naming 
and pointing use rote memory only. Among the best predictive 
tests were: memory for digits, word opposites, picture puzzles, block 
designs, and comprehending directions. In a validation study using 
forty-six cases the correlations between total preschool items and 
Alpha scores ranged from .0 to .80 for small age groups (4 to 23 
persons) and were about .32 for the combined age groups. When 
preschool scores were recalculated, using only the predictive items, 
the correlations remained about the same, indicating that the non- 
predictive items were useless. 

Studies such as this are of great significance, because they show 
consistencies in behavior over long periods of time and illustrate an 
approach which can be made more diagnostic by the selection of 
various primary abilities as criteria. 

Gesell (1929) reported that a diagnosis of mental defect might 
safely be made by an expert worker during the first year of the sub- 
ject’s life. His reports are based on clinical studies and do not utilize 
correlation analyses either of items or of total test scores. 

Illustration 34 presents one of Gesell's typical records. Here the 
developmental age shows regular progress from two months to twelve 
years. He concludes: 

The foregoing cases were selected from some 10,000 now on file at the 
Yale Clinic of Child Development. Numerous biographic case studies by 
members of the staff have strengthened the conclusion that the basic trends 
and tempo of behavior development, as a rule, manifest themselves in in- 
fancy. If a child has normal growth potentialities, it is almost certain that 
they will reveal themselves to clinical perception in the first two years of 
life. Temporary “irregularities” of development are more frequently en- 
countered throughout the preschool years, because of the nascency and 
interdependencies of behavior patterns during this formative period. In a 
few extremely exceptional instances, bound up with obscure emotional and 
physical factors, the signs of normality may be delayed as long as three 
years. On the other hand virtually every case of primary feeblemindedness 
can be diagnosed in the first year of life. In a wide clinical experience we 
have never seen a case of secondary feeblemindedness due to educational or 
environmental deprivation, although we have seen an occasional case in 
which the IQ itself had descended to an apparently defective level. 

STUDY GUIDE QUESTIONS 

1. What aspects of infants' growth are measured by Gesell's scales? 

2. How is cup behavior indicated and scored? 

3. What are the principal components of postural behavior? 

4. What are "increasing," "decreasing," and "focal" items? 

5. What are the advantages and disadvantages of total scores on pre- 
school scales? 

6. What is the basis for scoring Goodenough’s draw-a-man test? 

7. How is problem solving measured before three years? 

8. What skills are measured in reading-readiness tests? 

9. Compare the Minnesota, California, and Cattell preschool scales for 
range, analysis of content, adequacy of sampling, and administration. 

10. What stages in language development are recognized? 

11. What predictions from one age to another are usual on infant scales, 
preschool scales, reading-readiness tests? 

12. What explanations of variation in individual growth are given? 



CHAPTER VI 


INDIVIDUAL TESTS OF 
ABILITY 




Several important individual tests of general intelligence will be 
considered in this chapter. Their content will be presented, and 
the reasons for the selection of certain items and methods of scoring 
will be discussed. The principal uses of these tests are described, and 
a list of needed research activities is given. 


CHARACTERISTICS OF BINET-TYPE TESTS 

The individual tests that have been, and still are, applied most 
widely, both in this country and abroad, are principally of the 
Binet type. No accurate estimate of the number of such tests given 
annually is obtainable, but it can be safely said that they are used 
almost universally in studies of behavior difficulties and delinquency 
among children, and to a lesser extent in studies of adults. 

Binet-type tests employ a wide variety of tasks designed to distin- 
guish between bright and dull persons. They are administered in- 
dividually, and oral directions are used. The order of their presenta- 
tion may be varied somewhat at the discretion of the examiner in 
order to secure a good sample of the subject’s ability. These tests also 
have a total score and sometimes a profile of separate scores. The 
score is usually given in terms of MA and IQ, or corresponding cen- 
tiles of age groups. The test items are arranged according to mental- 
age levels, or simply according to relative difficulty. 

Early Scales 

Alfred Binet, born in 1857, in Nice, France, studied medicine under 
Charcot and Féré, and with Beaunis founded the Psychological 
Laboratory at the Sorbonne in Paris in 1889. He was an energetic 
worker, and during his professional life he conducted a very large 
number of technical researches. When in 1895 he became particularly 
interested in mental measurement, he and a student named Henri 
evaluated the tests which were then available. They decided that there 
was need for fewer tests of sensory processes and for more tests of com- 
plex thought processes. In 1898 Binet described current tests, most 
of which were not of his own invention, but which he thought would 
indicate accuracy of judgment and general mental ability. He men- 
tioned the following tests specifically: mental calculation, drawing 
a square, reconstruction of disarranged sentences, comprehension of 
abstract passages, questions of moral or social propriety, immediate 
memory for numbers and for objects, and imitation of paper folding. 
It is interesting to note that nearly all of these tests with slight 
variations are used in practically all of the more recent Binet-type 
tests and also in many group tests of intelligence. Binet also em- 
phasized the idea that mental measurement should result in a rating 
of persons with reference to one another rather than in an absolute 
rating of ability. 

In 1900 Binet published “Attention and Adaptation,” which re- 
ported a study of the differences between two groups of children, 
five who had been classified as bright by their teachers and principal, 
and six, as dull. The children of both groups averaged approximately 
eleven years of age. The two groups showed no large differences on 
tests of simple reaction time or of choice reaction time, perception of 
small variations in metronome speed, immediate memory for words, 
and speed of counting dots. He did find large differences between the 
groups on tests of tactile sensibility on the back of the hand; copying 
letters, words, and designs; cancelling letters from a printed page, 
and addition. He concluded that the bright showed a quicker and 
more accurate perception and solution of difficulties, and he defined 
attention as mental adaptation to a new situation. He was con- 
vinced that speed of routine acts had no relation to intelligence. 

After various studies of cephalic indices and 2-point thresholds on 
the skin, he again turned to mental tests and reported, in 1902, a 
large number of tests of free association of verbal processes on his two 
daughters, Marguerite, age fourteen and one-half, and Armande, 
age thirteen. No quantitative results were given, but Marguerite was 
clearly shown to be precise, practical-minded, well oriented in space 
but not in time, and little given to reverie. Armande was more 
imaginative, vague, detached, delighted in fantasies, and had a tend- 
ency to verbalism. 

In 1904 he worked with Simon in comparing feeble-minded and 
normal children and in 1905 they published a list of thirty tests 
which were arranged in order of difficulty. These items, shown in 
Illus. 35, were thought to depend more on broad cultural experiences 
than on specific academic training. 

ILLUS. 35 THE 1905 BINET SCALE 

1. Visual co-ordination of head and eyes 
2. Grasp a cube placed on the palm 
3. Grasp a cube held in line of vision 
4. Make a choice between pieces of wood and chocolate 
5. Unwrap chocolate from paper 
6. Obey simple orders; imitate gestures 
7. Touch head, nose, ear, cap, key, and string 
8. Find objects which the experimenter names in a picture 
9. Name objects pointed out in a picture 
10. Tell which of two lines is the longer 
11. Immediate memory for three digits 
12. Tell which of two weights is the heavier 
13. Suggestibility 
    a. Find object which is not among those presented, as in No. 8 
    b. Point to patapoum and nitchevo (nonsense words) in the picture, No. 8 
    c. Tell which of two equal lines is the longer 
14. Give definitions of house, horse, fork, and mama 
15. Immediate memory for sentences of fifteen words 
16. Give differences between paper and cardboard; fly and butterfly; wood 
    and glass 
17. Immediate memory for thirteen pictures of familiar objects 
18. Immediate memory for two designs, exposed ten seconds 
19. Immediate memory for list of digits, three, four, or five in the series 
20. Give similarities between blood and wild poppy; fly, butterfly, and flea; 
    newspaper, label, and picture 
21. Just noticeable differences in length of lines 
22. Arrange three, six, nine, twelve, and fifteen gram weights in order 
23. Find which weight has been removed from No. 22 
24. Find rhymes 
25. Complete simple sentence by adding one word (after Ebbinghaus) 
26. Construct sentence containing Paris, gutter, fortune 
27. Knowledge of what is the best thing to do in twenty-five situations of graded 
    difficulty 
28. Reverse clock hands at 3:57, at 5:40, and tell the time it would be 
29. Draw results of folding a piece of paper into quarters and cutting the once- 
    folded edge 
30. Distinguish between liking and respecting, between being sad and bored 

(Peterson, 1925, p. 172. By permission of the World Book Co.) 

On the basis of testing approximately fifty normal children and 
a larger number of mentally retarded children, called aments, tenta- 
tive norms were given. The first five tests were found to be passed by 
idiots and two-year-olds. The ninth test was the upper limit for 
normal three-year-olds, the fourteenth test for five-year-olds, and the 
fifteenth for adult imbeciles. The sixteenth item separated five- from 
seven-year-olds effectively. Morons fell below the twelve-year level in 
the test, and were more clearly distinguished from normal children 
by verbal than by nonverbal tests. The discrimination between higher 
age groups was made on the basis of type of answer. The authors 
realized keenly the need of more precise standards and more cases. 

Working independently in Worcester, Massachusetts, Terman 
(1906) published a comparison of seven superior and seven dull boys 
on tests of ingenuity, logical processes, mathematics, mastery of 
language, interpretation of fables, learning to play chess, immediate 
memory for numbers, for forms, and for stories, puzzle solution, and 
motor ability. A number of these tests had been described in earlier 
works, but most of them were originated by Terman. He found that 
the dull boys, who were older and stronger, excelled the bright on 
motor tests, but fell behind them on the mental tests. Many of these 
tests were standardized later. 

At the University of Rome, also working somewhat independently, 
Sante de Sanctis (1906) designed six tests to be used for classification 
of the feeble-minded. His battery emphasized what are now known 
as performance tests, in which one had to select and manipulate 
various colored balls and cubes and plane figures. 

At the Vineland Training School in New Jersey, Goddard (1909) 
collected and devised twenty-five tests to be used in the diagnosis of 
feeble-mindedness before he saw Binet's work. 

In 1908 Binet incorporated the mental-level concept into a new 
scale of fifty-nine items. Three to eight items were selected to repre- 
sent each age level from three to thirteen years. Each test was as- 
signed a certain number of months’ credit, and a person’s score was 
the total years and months of mental age which he had earned. The 
scale was arranged so that the average child of a particular chrono- 
logical age would have a corresponding mental age. 
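
The scoring rule just described, months of credit summed into years and months of mental age, can be sketched in a few lines of Python. The credit values and the record below are hypothetical and are not Binet's actual allotments; the function name is likewise only illustrative.

```python
def mental_age(results):
    """Sum the months of credit for every passed test.

    `results` is a list of (months_of_credit, passed) pairs.
    Returns the earned mental age as (years, months).
    """
    total_months = sum(credit for credit, passed in results if passed)
    return divmod(total_months, 12)


# Hypothetical record: credits are illustrative, not Binet's values.
record = [
    (36, True), (12, True), (12, True), (12, True),
    (6, True), (6, True), (6, False), (4, True), (4, False),
]
years, months = mental_age(record)
print(f"Mental age: {years} years, {months} months")
```

Failed tests contribute nothing, so the score depends only on the credits attached to the tests passed.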

This work was eagerly read. Translations of the 1908 scale were 
applied in Italy by Ferrari (1908), and by Treves and Soffiotti (1909), 
in Switzerland by Descoeudres (1911), in Germany by Bobertag 
(1912), in America by Goddard (1911), Kuhlmann (1911), Terman 
and Childs (1912), and in England by Johnson (1911). All of these 
investigators reported that Binet’s tests resulted in too large a per- 
centage of children rated as superior below the age of ten years, and 
too small a percentage in the higher age levels. 

Stern (1914), however, considered that this agreement of findings 
from six different countries using different languages and testing 
children from different environments was strong evidence that the 
tests measured the “general developmental conditions of intelligence 
— and not mere fragments of knowledge and attainments acquired by 
chance.” 

In Germany Stern (1900) published a book on differential psychol- 
ogy and Meumann (1905) made an extensive review of test literature 
which indicated notable advances in Germany before Binet's first 
scale became widely used. Meumann pointed out the need of an 
indication of rate of development, because the older feeble-minded 
children were more retarded in mental age than the younger. To 
meet this need, Kuhlmann (1912), Stern (1914), and Yerkes, Bridges, 
and Hardwick (1915), apparently working independently, suggested 
indices to be found by dividing an individual's achievement by the 
age-group average. Terman (1916) divided the MA by the CA and 
called the resulting index an IQ. 
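
A short Python sketch may make the point about rate of development concrete. It assumes the usual convention of multiplying the ratio by 100, which the passage above does not state, and the ages used are hypothetical.

```python
def intelligence_quotient(mental_age_months, chronological_age_months):
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100 * mental_age_months / chronological_age_months


# Hypothetical child tested twice: MA 3 at CA 6, then MA 6 at CA 12.
for ma_years, ca_years in [(3, 6), (6, 12)]:
    iq = intelligence_quotient(ma_years * 12, ca_years * 12)
    lag = ca_years - ma_years
    print(f"CA {ca_years}, MA {ma_years}: {lag} years retarded, IQ {iq:.0f}")
```

The retardation expressed in years doubles between the two examinations, while the ratio stays the same; this is the property that made an index of rate preferable to a simple difference in mental age.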

Terman became so much interested in the 1908 Binet scale that he 
applied a translation of it to four hundred nonselected children. He 
was then convinced that the scale was practical in spite of imperfec- 
tions in sampling and scaling techniques. He set about to correct 
some of these defects. The idea, which Binet and others had often 
repeated, that an adequate measure of intelligence can only be se- 
cured from a variety of tests was accepted because the pooled results 
seemed to distinguish between bright and dull better than any one 
single type of test. 

Binet's last scale appeared in 1911, the year in which he died. Sev- 
eral reading and writing tests which he believed depended too much 
on special training were omitted. He relocated many of the items, 
particularly in the higher age groups. These changes resulted in a 
scale which had five items at each age from three years to adult, 
omitting the eleven-, the thirteen-, and the fifteen-year levels. 

Terman’s Revisions 

In 1916 Terman published a revision of the 1911 Binet scale upon 
which he and his associates had been at work four years. The original 
items were restated to remove ambiguities in administration and 
scoring. The scale included tests for children of three years to adults 
of superior ability. It was rapidly adopted as the standard individ- 
ual test of intelligence by most of the schools and clinics in America. 

Terman’s 1916 scale was replaced after 20 years of wide application 
by a revision published in 1937 by Terman and Merrill. Most of the 
items of the 1916 scale were retained and the scale was extended 
down to the two-year level, and up to the twenty-two-year level. The 
1937 revision provides two equivalent forms, called Forms L and M 
(see Illus. 36). Both forms contain tests for each 6-month level from 
two to five years, and for each 12-month level from five to fourteen 
years, and for four higher levels. The four highest levels are called 
the Average Adult, Superior Adult I, Superior Adult II, and Superior 
Adult III. An inspection of Illus. 36 gives a rough idea of the types of 
processes used in these scales. At the earlier ages, objects, pictures, 
and parts of the body are more often included in the test situation 
than at later age levels. Tests of immediate memory spans for words 
and numbers are found throughout the scale. At the higher levels 
are found more items which seem to depend upon abstract verbal 
and numerical reasoning. The reasoning elements at any age level 
seem to the writer to contribute less to the average score than recall 
of learned factual material. A large box of standard materials is used 
at age levels from two to six years. Some of these materials are shown 
in Illus. 37. Cards with standard pictures and other printed material 
are also used at later ages. 

The authors discuss the tests qualitatively. Thus, on page 236 
verbal absurdities at the eight-year level are evaluated as follows: 

The detection of absurdities has again shown itself to be one of the most
valid and serviceable of our tests. No other test in the scale, with the
exception of vocabulary, yields more consistently high correlations with total
score. These range from .72 to .75 for a single age group. The test appears
to be little influenced by schooling or by differences in social status.

The purpose of the test is to discover whether the subject can point out 
the intellectually irreconcilable elements of the situation presented. The 
only difficulty is in judging whether the response shows that the subject has 
seen the incongruity. The child who has seen the point instantly often 
indicates that fact by repeating the critical phrase, and the dull, uncom- 
prehending child may just try to say over what you have said . . . 

In discussing the Memory for Designs Test in the ninth year (p.
248), the authors write:

The figures are, of course, perceived as meaningful wholes, that is, the
lines of the figures constitute designs and, in so far as they are recalled,
are recalled as related. Whatever may be the processes involved — attention,
visual memory, kinesthesis — it is very certain that they differ from one
individual to another and that the dependence, for instance, on visual cues is
more marked in some children than in others. It is often possible to note
the utilization of kinesthetic cues as the child practices the designs with
pencil in air during the ten-second exposure interval.

For half credit all of the elements must be present, but inaccuracies due
to omission or addition of details or to irregularities in size and shape of
the figures are overlooked. The samples on pages 250-51 indicate the
standard for plus, half credit, and minus.

ILLUS. 36. REVISED STANFORD-BINET TESTS OF INTELLIGENCE

Partial Outline of Form L

Year II
(6 tests count 1 month each, or 4* tests count 1½ months each.)

1.* Place three small blocks into similar holes in a board.
2.  Point to toys when their names are given.
3.* Point to parts of a large paper doll when parts are named.
4.  Build a four-cube tower after demonstration.
5.* Name common objects from separate pictures.
6.* Use a two-word sentence spontaneously ("See kitty").
Alternate. Obey simple commands to manipulate small toys.

Year III-6
(6 tests count 1 month each, or 4* tests count 1½ months each.)

1.* Obey simple commands to manipulate small toys.
2.* Name common objects from separate pictures.
3.  Point to the longer of two sticks.
4.  Name at least three objects shown in one picture.
5.* Point to objects to indicate use ("Show me which one we drink out of").
6.* Tell what to do in common situations.
Alternate. Draw a cross with a pencil after demonstration.

Year VI
(6 tests count 2 months each, or 4* tests count 3 months each.)

1.* Define five words orally by description, use, or classification.
2.* Make a simple bead-chain pattern from memory after a demonstration.
3.  Tell what part is missing from four pictured objects.
4.* Select certain numbers of blocks from a pile.
5.* Point to one of five pictured objects which is different from the rest.
6.  Draw a pencil line through a sample maze to mark the shortest path.

Year X
(6 tests count 2 months each, or 4* tests count 3 months each.)

1.* Define eleven words orally.
2.  Explain why the pictured actions of a person are foolish.
3.* Read a passage of 48 words, then recall from memory a considerable
    portion of it.
4.* Give two reasons to support an oral statement.
5.* Name as many disconnected words as possible in one minute.
6.  Repeat six digits after one oral presentation.

Average Adult
(8 tests count 2 months each, or 4* tests count 4 months each.)

1.* Define twenty words orally.
2.* Transcribe a short message in a code which is exposed.
3.* Give differences between two abstract words.
4.  Read short arithmetical problems and give answer without using paper
    and pencil.
5.  Tell what proverbs mean in own language.
6.* Give oral solution of a practical mechanical problem which is
    presented orally.
7.  After one oral presentation, repeat a 24-syllable sentence without error.
8.  Tell in what way verbal opposites are alike.

Superior Adult
(6 tests count 6 months each, or 4* tests count 9 months each.)

1.* Define 30 words orally.
2.  Read aloud a problem concerning direction and distance travelled, and
    give answers without using paper and pencil.
3.* Give opposites of words by analogy.
4.* Watch examiner fold and cut a piece of paper, then make a pencil
    drawing to show how the paper would look if it were unfolded.
5.  Read silently while the examiner reads aloud a simple geometric
    progression problem, then give answers without using paper and pencil.
6.* Repeat 9 digits after one oral presentation.

* Tests included in shorter form.

(Arranged from Terman and Merrill, 1937. By permission of the Houghton
Mifflin Co.)
ILLUS. 37. STANFORD-BINET TEST MATERIAL

(Terman and Merrill, 1937. By permission of the Houghton Mifflin Co.)


Kuhlmann's Revisions

Kuhlmann (1922) brought out a revision of the 1911 Binet upon
which he and some of his staff had been working seven years. It was
a remarkably thorough piece of work, which extended the scale downward
to the three-month level and up to a mental age of fifteen years.
He included 129 items, counting a test once for each age group in
which it is used. Of the original Binet tests thirty-seven were retained,
but a number of these were modified or shifted to other age levels.
The number of tests in each age group above two years was increased
to eight. Credit was allowed for both speed and accuracy on many
items.

Kuhlmann's (1939) Tests of Mental Development is the work of
the staff of the Division of Examination and Classification of the
Minnesota State Department of Public Instruction. About three
thousand nonselected public school and preschool children were
used to evaluate 121 tests which had been chosen from various
sources, particularly Kuhlmann's earlier test and Gesell's (1928)
studies of infancy and human growth. Eighty-nine regular tests and
nineteen supplementary tests were finally included in the scale on the
basis of four criteria:

1. Large increases in raw scores from age to age, in those tests which had 
several elements 

2. Large increases in per cents of children in successive age levels who 
pass a test (in those tests which were either passed or failed) 

3. Variability in raw scores in a single age group (a wide distribution of 
scores in one test was considered desirable) 

4. Correlations of single test scores with total scores (high correlations 
were preferred) 

The scores below the eight-year level, that is, on the easiest sixty-three
tests, were the number of items correct. On more difficult tests
the score used was speed-times-accuracy. Speed was usually calculated
by dividing the right-minus-wrong elements by the time in
seconds. The tests were usually limited to one or two minutes. Accuracy
was calculated by dividing the number of correct elements
by the number attempted. Multiplying speed by accuracy penalized
the inaccurate worker. This penalty was imposed by Kuhlmann when
he found fairly large negative correlations between speed and accuracy,
although both were related to intelligence and age.
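
The speed-times-accuracy score just described can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Kuhlmann's scoring procedure: the function name and the sample counts are hypothetical, and "number attempted" is assumed to mean right plus wrong elements.

```python
def speed_times_accuracy(right, wrong, seconds):
    """Score = speed * accuracy, following the description in the text.

    speed    = (right - wrong) / time in seconds
    accuracy = right / attempted
    Assumption of this sketch: attempted = right + wrong.
    """
    attempted = right + wrong
    if attempted == 0 or seconds <= 0:
        return 0.0
    speed = (right - wrong) / seconds
    accuracy = right / attempted
    return speed * accuracy


# Two hypothetical workers who each attempt 40 elements in 60 seconds.
careful = speed_times_accuracy(right=38, wrong=2, seconds=60)
careless = speed_times_accuracy(right=30, wrong=10, seconds=60)
print(careful, careless)
```

Because wrong elements lower both factors, the careless worker in this made-up case loses far more than the ten errors alone would suggest, which is the penalty the text describes.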

An innovation in this test battery is the use of a point scale de- 
signed by Heinis to give equal credits to equal units of growth. 
Each score on each test is changed by the use of conversion tables 
to mental units, called MU points. A person's total score, the sum 
of these points, can be changed to his mental age by another table. 
IQ's and Per cents of the Average (PA)¹ may be obtained from an 82-page
table. The PA is preferred by Kuhlmann since it has proved to
be more constant than the IQ's in retests over a period of years. He 
also found that age groups had more constant standard deviations 
for PA than for IQ scores. The standard deviation for PA's was eight 
points, approximately one half that of the IQ's. Kuhlmann believed 
that this smaller range of PA's is a better representation of actual in- 
dividual differences in intellect than was the larger range of IQ's. 
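
As a rough sketch of the PA index, the ratio of a person's MU points to the average MU points for his age group can be computed as below. The function name and the numbers are hypothetical, and the factor of 100 is inferred from the name "Per cent of the Average" rather than stated in the text; Kuhlmann's actual values come from his conversion tables.

```python
def percent_of_average(mu_points, age_group_mean_mu):
    """Per cent of the Average (PA): a person's MU points relative to the
    average MU points of his age group.

    The multiplication by 100 is an assumption drawn from the name of the
    index; the text only says the one quantity is divided by the other.
    """
    if age_group_mean_mu <= 0:
        raise ValueError("age-group mean must be positive")
    return 100.0 * mu_points / age_group_mean_mu


# Hypothetical child with 460 MU points in an age group averaging 500.
print(percent_of_average(460, 500))
```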

¹ The PA is an index secured by dividing a person's MU points by the
average MU points for his age group.

Point Scales

In 1915 and in 1923 Yerkes and others published a revision of
Binet's work which did not assign tests to mental-age levels, but
allowed credit of various points for each item. The total points could
be converted into a corresponding mental age by reference to a
table. Point scales are well illustrated by the work of Baker and
Leland (1935), who published a battery of nineteen tests, each of
which contained from 14 to 360 points. This battery includes the
main types of items which are included in the Stanford-Binet Forms,
and in addition a hand-and-eye coordination test. It needs a little
more time for administration than the Stanford-Binet, but it has the
advantage of including more items of each type, thus insuring a better
appraisal of the individual, and of allowing a profile of results
which shows a person’s strong and weak points. This analysis is
particularly valuable in remedial work and counseling.

In criticism of this scale it should be mentioned that the stand- 
ardization is not so complete as is desirable, and that the profile does 
not indicate traits which are known to be independent or unitary. 
Profiles such as this probably represent fairly independent behavior 
patterns, however, and are often more useful than a single mental- 
age score. From analysis of such profiles knowledge of independent 
variables will eventually come. 

The Wechsler-Bellevue Scale for Adults²

The Wechsler-Bellevue Scale is a widely used individual point
scale for the examination of adolescents and adults. The test items
have been selected to appeal to adults and to sample their abilities.
Wechsler (1944) states that the purpose of the scale is twofold: to
measure intelligence and to show diagnostic patterns of subtests.

He defines intelligence as “the aggregate or global capacity of the
individual to act purposefully, to think rationally, and to deal
effectively with his environment.” He gives three reasons for thinking
that intelligence is not the mere sum of abilities:

1. The ultimate products of intelligent behavior are not only a function
of the number of abilities or their quality but also of the way in which
they are combined, that is, of their configuration.

2. Factors other than intellectual ability, for example, those of drive
and incentive, enter into intelligent behavior.

3. Finally, while different orders of intelligent behavior may require
varying degrees of intellectual ability, an excess of any given ability may
add relatively little effectiveness to the behavior as a whole.

The Tests. The scale consists of eleven types of tests, all of which
are old in style, but new in content. Wechsler developed contents
which seem more interesting and certainly more thorough than those
of earlier tests. Six of the tests are termed verbal, and five performance
tests. The latter may be applied advantageously to those with language
or hearing handicaps, and the former to those with defective
vision. The verbal tests, all of which use oral directions and answers,
are listed below. The correlations for ages twenty to thirty-four
between scores on the subtest and scores on the whole scale are given
in parentheses.

² Wechsler has recently published a scale for children which is similar in
construction.

1. Information. Twenty-five questions, 7 involving geographical location
and distance, 3 authorship, 3 measures, 3 hard definitions, and one
each in aviation, date, inventor, name of the President, average height,
population, etc. The difficulty range is large. (r = .66)

2. General comprehension. Twelve questions involving knowledge,
judgment, and attitudes, such as, “What would you do if you found a
letter that was already sealed, stamped, and addressed?” (r = .66)

3. Arithmetical reasoning. Ten problems, to be done mentally, involving
simple language, analysis of a problem, and computation. (r = .63)

4. Digits forward and backward. The score is the longest span (list of
digits) to be repeated correctly, similar to the digit spans in the
Stanford-Binet Scale. (r = .51)

5. Verbal similarities. Twelve pairs of words, such as egg-seed, for
which one common aspect must be mentioned, as in the Stanford-Binet
Scale. (r = .73)

Alternate. A 42-word vocabulary test ranging in difficulty from apple to
traduce (r = .85) is the alternate test in the above series of verbal tests.

The performance tests include:

6. Picture completion. Fifteen cards, on which pictures of common objects
are lacking essential parts. The examinee must tell which parts
are missing. (r = .61)

7. Picture arrangements. Six series of cards similar to comic strips to
be arranged in temporal order. The shortest series has three cards, and
the longest six cards. (r = .51)

8. Object assembly. Three pictures cut apart: a manikin, a feature profile
similar to those used by Pintner and Paterson (1917), and a hand.
(r = .41 to .51)

9. Block designs. Seven designs similar to the Kohs (1927) Series. Only
red and white colors are used. (r = .73)

10. Digit-symbol. Similar to the U. S. Army Beta Test No. 4 (Illus. 9).
This is a paper-and-pencil test in which one must write in the squares
symbols which correspond to the given digits as shown in the key at
the top of the test page. (r = .63)

Interpretation of Tests. All of the tests have been standardized
on age groups of normal persons. IQ's are given for each 3-month
period from ten to fourteen years, and for fifteen- and sixteen-year
groups, for a group from seventeen through nineteen years, and for
eight five-year groups from twenty through fifty-nine years. In each
of the original groups there were from 50 to 175 subjects.

The scoring of this scale, like that of all oral individual scales, is
rather intricate. The responses for each item yield zero, half, or full
credit, or varying points according to the examiner’s judgment of
the correctness and completeness of performance. The 22 pages of
criteria for scoring leave many close decisions to the examiner’s
judgment. The raw scores for each of the tests are given approximately
equal weight, by converting them into points on a scale for
which the mean is arbitrarily set at 10, and the standard deviation
(SD) at 3 points. The sum of the verbal-test scores yields a verbal
MA and IQ, and the performance tests yield similar indices. A total
MA and IQ are also available.

All of these indices are secured from tables which have been prepared
so that the mean IQ will be 100 and the Probable Error (PE)
10. Since an SD is 1.4826 PE in a normal curve, the SD is approximately
15 for these IQ’s. This is slightly smaller than the SD of 17
used by Terman and Merrill (1937). The total split-half reliability
corrected for attenuation was .90, and the correlation between verbal
and performance sections corrected for attenuation was .83 in
a group of 355 adults. The median correlation of separate tests
with total scores was approximately .65 for this group, with Vocabulary,
Similarities, and Block Designs showing the highest correlations
(from .70 to .85) and Object Assembly and Digit Span the lowest
(from .41 to .50).
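
The relation between probable error and standard deviation quoted above can be checked numerically. The sketch below uses Python's standard `statistics.NormalDist`; the function name `sd_from_pe` is introduced here for illustration only. In a normal curve the PE is the distance from the mean that encloses the middle half of the cases, that is, the 75th-centile deviate.

```python
from statistics import NormalDist

# In a normal curve, one PE above the mean is the 75th centile,
# so PE = 0.6745 SD and SD = PE / 0.6745 = 1.4826 PE.
pe_in_sd_units = NormalDist().inv_cdf(0.75)
sd_per_pe = 1.0 / pe_in_sd_units


def sd_from_pe(pe):
    """Convert a probable error to a standard deviation (normal curve)."""
    return sd_per_pe * pe


print(round(sd_per_pe, 4))        # the 1.4826 factor used in the text
print(round(sd_from_pe(10), 1))   # SD implied by a PE of 10 IQ points
```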

Intelligence Quotient Redefined. While retaining the IQ,
Wechsler redefined it simply as the place of an individual in a group
of individuals of approximately the same age. The average for any
age will thus be 100 and the probable error, 10 points. Standard
scores or centiles have therefore a constant relationship with IQ’s.
In defining IQ in this way, Wechsler has ceased to use the concept
of mental age for calculating purposes. For practical purposes he
shows scores corresponding to age-group averages so that one's absolute
ability as well as the relative position in the group may be given.
Illustration 42B shows the classification and distributions.
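
The constant relationship between standard scores, centiles, and IQ's defined in this way can be sketched as follows. This is an illustration under a normal-curve assumption, not Wechsler's procedure, which relies on prepared tables; the function names and the sample age-group mean and SD are hypothetical.

```python
from statistics import NormalDist

SD_PER_PE = 1.4826  # SD = 1.4826 PE in a normal curve


def deviation_iq(score, age_mean, age_sd, iq_mean=100.0, iq_pe=10.0):
    """Place a score within its own age group and express it as an IQ
    with mean 100 and probable error 10."""
    z = (score - age_mean) / age_sd          # standard score in the age group
    return iq_mean + SD_PER_PE * iq_pe * z


def centile_from_iq(iq, iq_mean=100.0, iq_pe=10.0):
    """Per cent of the age group falling below a given IQ (normal curve)."""
    return 100.0 * NormalDist(iq_mean, SD_PER_PE * iq_pe).cdf(iq)


# Hypothetical age group with a mean weighted score of 50 and an SD of 10.
print(deviation_iq(60, age_mean=50, age_sd=10))
print(centile_from_iq(110))   # one PE above the mean
```

An IQ one probable error above the mean falls at about the 75th centile whatever the age group, which is the sense in which the relationship is constant.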

CONSTRUCTION OF BINET-TYPE TESTS 

Two main procedures are necessary in the construction of any
mental test: the selection of items and the scaling of items.

Selection of Items 

The principal criteria used in selecting an item for Binet-type
scales are described below.

Discrimination between Bright and Dull Children. In spite of
the mass of data available, little has been published lately to show
which items really discriminate best between normal and defective
groups of the same chronological age. Burt (1922), Merrill (1924),
and McNemar (1942), however, have confirmed the earlier work by
showing, in scores of groups selected on the basis of school progress,
large mean differences on the usual tests of facility in the use of
language and number symbols, and smaller mean differences on
motor and performance tests.

How items are selected is shown in Illus. 38, which compares the
percentages of persons of the same chronological age in three IQ
groups who passed each item listed. This shows that the first item
in the sixth year (distinguishing right from left parts of the body),
was passed by 43 per cent of the six-year-olds whose IQ’s were below
96, by 70 per cent of the six-year-olds whose IQ’s ranged from 96 to
105 inclusive, and by 89 per cent of the six-year-olds whose IQ’s were
above 105. A glance at this illustration shows that there are fairly
large differences among these IQ groups on nearly all items. Items
which did not show as large differences were eliminated by Terman.
Selection of items in this fashion undoubtedly increases the
internal consistency of a test.

In this illustration the indications of brightness are IQ’s which 
were determined from the test items themselves. Hence the differ- 
ences between IQ groups are doubtless somewhat larger than would 
be found among groups which were selected for relative brightness 
by some other method, for example, a teacher’s ratings.
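
The comparison of IQ groups can be illustrated with a short sketch. Only the first item's percentages (43, 70, 89) come from the text; the second item and the cutoff of 20 points are hypothetical, since the size of difference Terman required is not given here.

```python
def discrimination_spread(pct_low, pct_mid, pct_high):
    """Difference in per cent passing between the highest and lowest
    IQ groups of the same chronological age."""
    return pct_high - pct_low


# Per cents passing in three IQ groups: below 96, 96 to 105, above 105.
items = {
    "right and left (from the text)": (43, 70, 89),
    "hypothetical weak item": (61, 64, 66),
}

MIN_SPREAD = 20  # illustrative cutoff only, not Terman's criterion

kept = [name for name, pcts in items.items()
        if discrimination_spread(*pcts) >= MIN_SPREAD]
print(kept)
```

Since the IQ groups are themselves defined by scores on the same items, a spread computed this way is somewhat inflated, as the paragraph above points out.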

Correlation between Item Scores and Total Scores. Terman
and Merrill (1938) and McNemar (1942) published correlations between
item scores and total scores. The highest are those for vocabulary
which, for various age groups, range from .65 to .91, median
.81. The lowest correlations reported are for block counting, .43;
counting taps, .50; motor coordination, .46; and picture absurdities,
.56.
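
A correlation between an item score and the total score can be computed as an ordinary product-moment coefficient with the item coded 1 for pass and 0 for fail. The sketch below uses made-up data for eight children; it shows the calculation only and does not reproduce any of the published figures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: 1 = item passed, 0 = item failed, with total scores.
item_passed = [0, 0, 1, 0, 1, 1, 1, 1]
total_score = [12, 15, 18, 20, 22, 25, 27, 30]

r = pearson_r(item_passed, total_score)
print(round(r, 2))
```

When the total score includes the item itself, the coefficient is raised slightly, which matters most for short scales.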

Fourteen detailed factor analyses of many of the 1937 Stanford
Revision items are reported by McNemar (1942). In all fourteen
matrices a first common factor accounts for from 35 to 50 per cent
of the variance of various items. A second factor accounts for from
5 to 11 per cent, and a third for from 4 to 7 per cent. The first factor
seems to be nearly the same at various ages and to be a combination
of general thinking and verbal knowledge, which is heavily emphasized.
The second and third factors appear to be somewhat different
at different ages. At the earlier ages there is evidence of a
motor factor, and at the later ages there is evidence of a verbal, number,
memory, or problem factor. These results seem to be almost the
same as the results obtained by Burt (1922) in a similar study.
ILLUS. 38. RELATIVE DIFFICULTY OF ITEMS FOR NORMAL, RETARDED,
AND SUPERIOR SUBJECTS

Per Cents of Single-Year Groups

(A. from Terman et al., 1917. By permission of the Editor, Educational
Psychology Monographs; B. from Merrill, 1924. By permission of the Editor,
Comparative Psychology Monographs.)
Kuhlmann (1939) published correlations between total scores and
separate items for age groups of approximately 150 pupils. The
median was approximately .45. There was a marked tendency for the
harder tasks to show smaller correlations than the easier. Thus, the
tests used below the age of three years had a median correlation with
the total score of approximately .91, from three to five and one half
years, of .50, and above five and one half years, of .39. This finding
may indicate a much greater internal consistency at the lower ages,
or it may be due to other factors, such as the length of the tests and
the range of scores in a group.

The results of both Kuhlmann and Terman lead one to suspect
that a number of highly independent factors are sampled by these
tests, a hypothesis discussed in Chapter VIII.

Discrimination among Adjacent Age Groups. Illustration 39
shows the selection of items which distinguish between adjacent
chronological age groups, and the percentages of persons who passed
various items. From these data a growth curve for each item may be
prepared (Illus. 40). The most desirable item is one which shows a
rapid period of growth, both before and after the age which is
to be measured. Such items will distinguish between adjacent age
groups better than items which show little increase during the
ages to be compared. The items shown in Illus. 39 were selected
in part because they had fairly similar and regular growth
curves. Other items have been found (Illus. 75, page 196) which
do not show such similar and regular growth curves. The shapes of
such curves depend undoubtedly upon both native and environmental
factors. A special training period often results in a rapid rise in a
growth curve. Training is therefore an important factor in determining
both the difficulty of an item for a particular age group and the
resulting MA scores for persons in that group.

ILLUS. 39. PER CENTS OF AGE GROUPS PASSING ITEMS
(After Burt, 1922, Table III.)

ILLUS. 40. PER CENTS OF AGE GROUPS PASSING BINET ITEMS
(After Burt, 1922, Table III.)

Other Criteria. In the 1937 Stanford Revision, the items were 
selected after six successive revisions which involved shifting items 
from one form to another, and sometimes modifying the scoring to 
make an item harder or easier. The following features were considered
to be desirable in the final selection of items: (1) ease of scoring,
(2) short time requirements, (3) interest to subjects, (4) balancing
or elimination of sex differences, and (5) equal means and standard
deviations of IQ’s at various age levels.

In attaining these features, thirty thousand cards were used for 
the mechanical tabulation of the results of the testing of the 3,184 
native-born white persons in the standardization group. The authors 
Terman and Merrill (1937, p. 22) write: 

By means of the Hollerith sorter it was then possible to plot for each test
the curve showing per cent of subjects passing in successive ages throughout
the range, also the curve of per cent passing by successive intervals of
composite total score on the two forms. This was done for the sexes separately
as a basis for eliminating tests which were relatively less fair to one sex
than the other. It was possible also to compare the scores on the form given
first with those on the form given second, to study the effect of practice, and
to allow for it in the computation of the composite IQ's. The correlation of
each test with composite total score (equivalent to correlation with mental
age) was computed separately for each test, thus providing a basis for the
elimination of the least valid tests. One important use of the Hollerith data
was in connection with the balancing of the two scales; it was important
that at each level the two scales should be as nearly alike as possible with
respect to the relative difficulty of the tests located at that level and with
respect to their correlation with total score.

These criteria of selecting items for a Binet-type test have often 
been challenged for the lack of analysis of the psychological factors 
involved. Brightness and age status are probably complex resultants
of a number of forces which may be independent. Methods of analyz- 
ing such forces are discussed in Chapter XIV. For the present, let 
us assume that items which seem most appropriate have been selected, 
and consider the next step, which is the scaling. 

Scaling of Items 

Mental age is usually defined by a score on a test which represents
the average of a narrow age group. For instance, the average score
of a group of children just six years old is taken as the 6-year
standard; persons who make this score are said to have a mental age of 6.

There are two methods of changing test scores into mental age
scores. One, called point scaling, assigns points of credit to all the
items in a test, sums the individual credits, and then finds the average
scores of various age groups. Point scaling is easy to apply since it
only requires that the items be roughly placed in order of difficulty
so that the subject is presented with all the tasks on which he is
likely to succeed. This method has been used by Yerkes et al. (1915),
Baker and Leland (1935), and Kuhlmann (1939). The other method,
called age-level scaling, assigns test items to particular age levels and
allows a number of months of credit for each item. This method,
used by Terman, is hard to apply because it is difficult to find items
which fit exactly into a particular age level.
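
The two methods can be contrasted in a brief sketch. Everything here is illustrative: the function names, the norms table, and the idea of starting from a "basal" level (the highest level at which every item is passed) are assumptions introduced for the example, not details given in the text.

```python
def mental_age_age_level(basal_years, months_credit):
    """Age-level scaling: begin at the highest level passed completely
    (assumed 'basal' level) and add the months of credit earned for each
    item passed above it. Returns (years, months)."""
    total_months = basal_years * 12 + sum(months_credit)
    return divmod(total_months, 12)


def mental_age_point_scale(total_points, age_norms):
    """Point scaling: find the age whose average score equals the total
    points, interpolating linearly between tabled ages.
    age_norms is a list of (age in years, average points), ascending."""
    for (age_lo, pts_lo), (age_hi, pts_hi) in zip(age_norms, age_norms[1:]):
        if pts_lo <= total_points <= pts_hi:
            frac = (total_points - pts_lo) / (pts_hi - pts_lo)
            return age_lo + frac * (age_hi - age_lo)
    raise ValueError("score outside the range of the norms table")


# Age-level example: all Year VI items passed, then four items passed
# at higher levels where each item counts 2 months.
print(mental_age_age_level(6, [2, 2, 2, 2]))

# Point-scale example with hypothetical age-group averages.
norms = [(6, 40), (7, 50), (8, 58)]
print(mental_age_point_scale(45, norms))
```

The point-scale function needs only a table of age-group averages, whereas the age-level function presupposes that every item has already been located at a level and given a month value, which is the harder task described above.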

Assignment to an Age Level. The process of assigning an item to
a particular age level is rather intricate. First, the items are
administered to a fairly large group of persons of various ages. Then
the per cents of persons who pass the item are recorded, as in Illus.
39. An inspection of this illustration will show approximately at
what age 50 per cent of a group pass the item, and how well any item
discriminates between age groups. Here an item in the eleven-year
level, Absurdities, shows at age nine, 29 per cent passing; at age ten,
49 per cent passing; at age eleven, 70 per cent passing; and at age
twelve, 79 per cent passing. One is justified in concluding that it does
discriminate fairly well among nine-, ten-, and eleven-year groups,
but not between eleven- and twelve-year groups.
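
The inspection described above can be made explicit with the Absurdities figures quoted in the text. The sketch below computes the gain between adjacent age groups and the age at which 50 per cent would pass, using straight-line interpolation. Treating each age-group label as a single point on the age axis is a simplification adopted here; the function names are hypothetical.

```python
# Per cent passing the Absurdities item, by age group (figures from the text).
absurdities = {9: 29, 10: 49, 11: 70, 12: 79}


def adjacent_gains(pct_by_age):
    """Increase in per cent passing between each pair of adjacent ages."""
    ages = sorted(pct_by_age)
    return {(a, b): pct_by_age[b] - pct_by_age[a]
            for a, b in zip(ages, ages[1:])}


def age_at_percent(pct_by_age, target=50.0):
    """Age at which the target per cent pass, by linear interpolation
    between adjacent ages; None if the target is never crossed."""
    ages = sorted(pct_by_age)
    for a, b in zip(ages, ages[1:]):
        lo, hi = pct_by_age[a], pct_by_age[b]
        if lo <= target <= hi and hi != lo:
            return a + (target - lo) / (hi - lo) * (b - a)
    return None


print(adjacent_gains(absurdities))
print(round(age_at_percent(absurdities), 2))
```

The gains of 20 and 21 points between nine and ten and between ten and eleven, against only 9 points between eleven and twelve, are the basis for the conclusion drawn in the paragraph.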

Illustration 39 also shows that the eleven-year group ranged from 
58 to 70 per cent passing the five items for this age level. If all persons 
in the eleven-year group were just eleven years old, items would be 
needed at this level which were passed by just 50 per cent of the group 
in order to give the average person an MA of eleven years. This eleven- 
year-old group, however, included children from eleven years to 
eleven years, eleven months, with an average of eleven years, six 
months. Hence, if the item is to represent an eleven-year level only,
the per cent passing must be higher than 50 per cent. 

In the 1916 Terman scale, the items assigned to each age level had
various mean per cents passing, as shown in Illus. 41. The smaller
per cents passing shown in the higher age levels are due to the fact
that growth increments become smaller with age. This fact is also
shown in Illus. 39, where the differences between per cents passing
in adjacent age groups are larger among the younger age groups
than among the older.

The average per cents shown in Illus. 41 are central tendencies.
Actually the items assigned to a particular age level are not all of
the same difficulty. Illustration 38 shows that the range was considerable.
In year eight the range is from 55 to 90 per cent. The more
recent revision has a smaller range of difficulty at each year level. The
problem of reducing this range is a persistent one because the figures for
one age group will often differ from the figures for another age group which
has had somewhat different training.

Nearly all the more recent revisions have used for standardization groups
of children who were within one month of a particular birthday. This
procedure results in greater accuracy in the location of tests at a
particular age level, but does not eliminate the difficulty of securing
tests which were passed by a certain per cent of an age group.

When the scaling of items is at last completed, the final form of the test is
ready for wider use. Its correct application and scoring require a great deal
of preliminary training. Persons wishing to become proficient examiners should
have, in addition to a college major in psychology, at least a year of special
work in observing and administering tests under careful supervision.
Thorough interpretation of scores requires even wider experience,
including a knowledge of social factors which may affect the results,
and a knowledge of probable sources of error.

INTERPRETATION OF MENTAL AGES 

Mental age scores from Binet-type scales have the advantages of 
being widely referred to, fairly easily calculated, and in some respects 
easily understood. Since one often gets useful information from these 
scales, he should be aware of their main limitations. Some of the
sources of error in the use of mental ages are discussed here. 

Inequality of Steps 

Nearly all studies of mental growth show that increments of
growth decrease from year to year. For most of the available scales
the observed differences between mental ages two and three years
are much more noticeable than the differences between mental ages
nine and ten years. Many of the mental age scales ignore this important
fact, and their use leads to conclusions that a mental age of
eight years represents twice the ability of a mental age of four, or
that the year's growth from one to two years is equal to that from
eleven to twelve years. Such statements, although common, are of
doubtful value since they require special interpretations.

ILLUS. 41. AVERAGE PER CENTS OF AN AGE GROUP PASSING THE 1916
STANFORD-BINET TESTS FOR THAT AGE

Age               Per Cent
3                 77.0
4                 77.0
5                 71.3
6                 70.8
7                 68.0
8                 63.2
9                 62.3
10                64.5
12                62.4
14                55.6
Average Adult     59.8
Superior Adult    37.4*

* Per cent of Average Adult sample.

(After Terman et al., 1917, p. 158. By permission of the Editor,
Educational Psychology Monographs.)

Changes in Meaning of Mental Ages above Adult Level 

It is altogether impossible to measure persons in the upper half
of the adult population on ordinary mental age scales, for they surpass
the average adult's score, and a mental age was originally defined
as the average score made by an age group. In order to overcome
this difficulty, Terman (1916) and later several others have arbitrarily
extended mental age scales above the average adult level. Thus, a
mental age of 20 is not the score of an average twenty-year-old, but
of a person who is generally in the highest 5 per cent of the
twenty-year-olds. Mental age in this case loses its original meaning of
representing the average of a particular age group, and becomes an
arbitrary point fixed above the average in such a way that the IQ's
of adults will be distributed according to some hypothesis.

In assigning arbitrary MA's above the average adult, Terman and
Merrill (1937) assumed that adults should have the same distribution
of IQ's as the children who can be measured by using true MA
scores, that is, children from approximately five to ten years of age.
These age groups were found to have fairly normal distributions of
IQ's with the mean near 100 and the standard deviation approximately
17. The scaling method used for adults was as follows: tests
which were passed by less than half of the adults tested were arranged
in order of difficulty as shown by the percentage passing. They were
then assigned places in various adult levels, and given a certain
number of months-of-growth credit so that the adult sample would have
an average IQ near 100 and a standard deviation of 17. This process
resulted in a Superior Adult Level I, which corresponds to a mental
age of seventeen years, four months, with six tests (Illus. 36), each of
which is assigned a mental age credit of 4 months. The Superior
Adult Level II contains six tests, the successful completion of each of
which is given credit of 5 months. The Superior Adult Level III,
which corresponds to a mental age of twenty-two years, ten months,
has six harder tests in it, each of which is assigned credit of 6 months.
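
The month credits quoted above can be checked with a little arithmetic.
The sketch below is only an illustration. It assumes (the text does not
say so explicitly) that the mental age quoted for a level is the value
credited once every test through that level has been passed. The
function name add_months and the variable names are hypothetical.

```python
def add_months(years, months, extra_months):
    """Add a month credit to a mental age given as (years, months)."""
    total = years * 12 + months + extra_months
    return divmod(total, 12)  # (years, months)

# Credits quoted in the text: six tests at each Superior Adult level.
level_1_ma = (17, 4)                          # stated for Level I
level_2_ma = add_months(*level_1_ma, 6 * 5)   # six tests at 5 months each
level_3_ma = add_months(*level_2_ma, 6 * 6)   # six tests at 6 months each

print(level_2_ma)  # (19, 10)
print(level_3_ma)  # (22, 10)
```

Under that assumption the six 5-month and six 6-month credits carry a
mental age of seventeen years, four months up to twenty-two years, ten
months, which agrees with the figure given for Level III.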

These changes in the meaning of mental age above the average
adult level are not as widely understood as they should be, and this
misunderstanding has led to many errors in interpreting results.
Moreover, the methods of assigning arbitrary mental age values are
not well agreed upon, so that various authors have devised different
standards which are not interchangeable.

Processes Measured at Different Ages 

An inspection of all the Binet-type scales shows a distinct tend- 
ency away from the use of pictures or objects and toward more ab- 
stract verbal skills with increasing MA levels. In order to allow cor- 
rect interpretation, more research is needed to show the amount of 
similarity or dissimilarity of processes sampled at various age levels. 

Components of Equal Mental Ages 

The question has often been raised whether a mental age of 10 in
a ten-year-old represents the same abilities as a mental age of 10
in a fifteen-year-old. The answer has been given by elaborate studies
comparing the scores of retarded and normal children of the same
MA's (Illus. 38B). This illustration shows that Items VI-1 and VI-6
are more difficult for retarded than for normal or superior children,
whereas VI-2, VI-3, and VI-6 are much easier for the retarded than
for the superior. Burt (1922) and Merrill (1924), summarizing the
work of others as well as their own, state that there is a marked
tendency for older persons in an MA group to succeed better than
the younger on items which depend to some extent on muscular
maturation or rote memory. Verbal discriminations, unusual
interpretations, and number relations are, however, relatively easier for
the bright than for the dull. These results indicate that there may be
several fairly independent variables in the test situation, such as
rote memory for familiar facts, observation and comparison
activities, and number combinations.

Thompson and Magaret (1947) compared responses of 441 defectives
on the Stanford-Binet Form L with the percentages of responses
of the standardizing group. Thirty items were found which showed
significant differences between groups of similar mental ages. The
defectives were superior on 11 items and inferior on 19. The results
support the following three hypotheses:

1. Items dependent upon practical experience are about as difficult
for normals as for the defectives.

2. Items where "rigidity" is a handicap are equally difficult for
normals and defectives.

3. Items which were more heavily loaded with McNemar's general
factor for the Stanford-Binet are easier for the normals than for the
retarded.

This evidence is confirmed by other observers. A mental age of ten
in the case of the retarded person usually represents a slow rote
performance, whereas a mental age of ten in the case of a superior
person represents success in quick observations and inferences. The
speed factor, which seems important to many in evaluating superior
mental ability, does not count much on the Stanford-Binet or Bellevue
scales, but it is stressed in the higher levels of Kuhlmann's
Revision. Since MA's and IQ's are not designed for analytical work,
they do not demonstrate these qualitative differences between bright
and dull children. Variations in the meaning of MA's can be overcome
only by designing tests in which a certain score will always
represent a particular pattern of skill.

INTERPRETATION OF INTELLIGENCE 
QUOTIENTS 

Because the IQ is so widely used, its main limitations need to be 
examined closely. These may be considered under the following five 
headings. 

Similarity of Average IQ’s 

One criterion of comparability is seen in the average scores of
various age groups. If average scores are all 100 or nearly 100, then
one can be sure that persons with IQ's of 100 all stand at the middle
or 50th centile of their age groups. On the 1937 revision of the
Stanford-Binet Scale, the average IQ's of various age groups used for
standardization actually ranged from 100 to 109. This means that
an IQ of 100 represents the 50th centile in one group and
approximately the 30th centile in another, since the standard deviations of
these groups were all approximately 17 points. The authors found,
however, that smoothed curves of average IQ's showed less variation,
and it is usually the case that larger populations give more constant
average scores. Hence these variations in mean IQ's are only
considered to be a serious source of error in careful studies of growth.
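
The step from a group mean of 109 to "approximately the 30th centile"
can be reproduced with a short calculation. This is a minimal sketch
that assumes the IQ's of each age group are normally distributed with a
standard deviation of 17, as the passage states approximately; the
function name centile is hypothetical.

```python
from statistics import NormalDist

def centile(iq, group_mean, group_sd=17.0):
    """Per cent of a normally distributed age group scoring below `iq`."""
    return 100.0 * NormalDist(mu=group_mean, sigma=group_sd).cdf(iq)

# An IQ of 100 in a group averaging 100, and in a group averaging 109.
print(round(centile(100, 100)))  # 50
print(round(centile(100, 109)))  # 30
```

An IQ of 100 lies 9/17 of a standard deviation below a mean of 109,
which is where the drop from the 50th to about the 30th centile comes
from.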

Similarity of Standard Deviations of IQ’s 

Frequency distributions of IQ's of various age groups have usually
given curves which appear to be nearly normal in shape, with the
mean at approximately 100 and the standard deviation about 17.
Illustration 42 (A and B) contains distributions of IQ's and two
commonly used classifications of brightness.
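
The theoretical percentages in Wechsler's classification (Illus. 42B)
follow from the normal curve once the limits are expressed in probable
errors (PE). The sketch below is an illustration only; it uses the
standard definition of the probable error as the distance from the mean
that encloses the middle 50 per cent of a normal distribution (about
0.6745 standard deviations). The function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()
# One probable error, in standard-deviation units (about 0.6745).
PE = STANDARD_NORMAL.inv_cdf(0.75)

def pct_between(lo_pe, hi_pe):
    """Per cent of a normal distribution between two limits given in PE."""
    return 100.0 * (STANDARD_NORMAL.cdf(hi_pe * PE) - STANDARD_NORMAL.cdf(lo_pe * PE))

def pct_beyond(pe):
    """Per cent of a normal distribution lying above `pe` probable errors."""
    return 100.0 * (1.0 - STANDARD_NORMAL.cdf(pe * PE))

print(round(pct_between(-1, 1), 2))  # Average
print(round(pct_between(1, 2), 2))   # Bright Normal (and Dull Normal)
print(round(pct_between(2, 3), 2))   # Superior (and Borderline)
print(round(pct_beyond(3), 2))       # Very Superior (and Defective)
```

The four values come out close to the 50.00, 16.13, 6.72, and 2.15 per
cent listed in the illustration, which shows that the classification is
a statement about the normal curve rather than about observed scores.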

ILLUS. 42A. DISTRIBUTIONS OF COMPOSITE L-M IQ'S OF THE
STANDARDIZATION GROUP

    IQ          Classification           Per Cent

    160-169                               0.03
    150-159     very superior             0.2
    140-149                               1.1
    130-139     superior                  3.1
    120-129                               8.2
    110-119     high average             18.1
    100-109     normal or average        23.5
    90-99                                23.0
    80-89       low average              14.5
    70-79       borderline defective      5.6
    60-69                                 2.0
    50-59       mentally defective        0.4
    40-49                                 0.2
    30-39                                 0.03

Note: The normal group in A is more inclusive than in B. Merrill pointed
out that the distribution of scores in B was somewhat skewed toward the
higher end of the scale.

(Merrill, 1938. By permission of the Editor, Journal of Educational
Psychology.)


ILLUS. 42B. WECHSLER'S STATISTICAL BASIS OF INTELLIGENCE
CLASSIFICATIONS (THEORETICAL)

    Classification   Limits in        IQ Limits      Per Cent   Centile
                     Terms of PE                     Included

    Defective        -3 and below     65 and below     2.15       0.00
    Borderline       -2 to -3         66-79            6.72       2.16
    Dull Normal      -1 to -2         80-90           16.13       8.87
    Average          -1 to +1         91-110          50.00      25
    Bright Normal    +1 to +2         111-119         16.13      75
    Superior         +2 to +3         120-127          6.72      91.14
    Very Superior    +3 and over      128 and over     2.15      97.86

(After Wechsler, 1944, p. 40. By permission of the author and
Williams & Wilkins Company.)


There may be, however, differences in the dispersions of IQ's for
various age groups. Kuhlmann (1939) reported larger SD's in preschool
groups than among the older groups. Terman and Merrill
(1937) attempted to keep dispersions constant at approximately 17
points by selection of items, but variations still appear. They point
out (p. 40):

Attention, however, should be drawn to ages six and twelve, where the
relatively low and high values respectively are deviations too extreme to
be explained as purely chance fluctuations. The high variability at age
twelve might conceivably be ascribed to the differential age of the onset
of pubescence, although it has yet to be demonstrated that pubescence
is significantly related to the rate of mental growth. Whether the atypical
IQ variability at age six resides in the character of the sampling at that
age, or whether it is perhaps an artifact of the nature of the scale at that
level cannot be determined from the available data. In the lack of positive
proof to the contrary, we are probably justified in assuming that the true
variability is approximately constant from age to age. Repeated tests of the
same subjects from early childhood to maturity will be necessary to determine
whether this assumption is in accord with the facts.

McNemar (1942) has emphatically pointed out, however, that the
normal shape of the usually found IQ distributions does not mean
that intelligence is normally distributed in the groups tested. The
shape of the distribution is dependent partly upon the units of
measurement, and upon the accuracy with which a trait is sampled,
neither of which can be directly observed.

Calculation for Adults 

When an attempt is made to calculate IQ's for adults, a serious
difficulty is met, because mental age gradually stops increasing
during adolescence, while chronological age, of course, increases
continuously. The difficulty is illustrated by the case of a person who had
an MA of 16 when he was sixteen years old, and hence an IQ of 100. 
The same person might still have an MA of 16 at the age of twenty- 
four, which, if IQ's were calculated as before, would result in an 
IQ of sixty-seven. Such a change in IQ is undesirable and with the 
usual interpretation of IQ's it would be ridiculous. In order to have 
adults' IQ's remain constant, Terman decided to let chronological 
age remain constant from the time when mental age seemed to cease 
increasing in normal individuals. 
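
The example above can be worked through with the usual ratio formula,
IQ = 100 x MA / CA. The sketch below is illustrative only; the function
name ratio_iq and the parameter adult_ceiling are hypothetical, and the
ceiling of sixteen years is simply the value attributed to Terman (1916)
in this chapter.

```python
def ratio_iq(mental_age, chronological_age, adult_ceiling=None):
    """Ratio IQ = 100 * MA / CA, with an optional ceiling on the CA divisor."""
    ca = chronological_age
    if adult_ceiling is not None:
        # Hold chronological age constant once the ceiling is reached.
        ca = min(ca, adult_ceiling)
    return round(100 * mental_age / ca)

print(ratio_iq(16, 16))                    # 100
print(ratio_iq(16, 24))                    # 67
print(ratio_iq(16, 24, adult_ceiling=16))  # 100
```

With the divisor held at sixteen, the adult whose mental age has stopped
increasing keeps an IQ of 100 instead of dropping to 67.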

A good deal of work has been done to find the average age of reach- 
ing maturity of intelligence or general ability, but no definite point 
has been found because growth ceases gradually, and it is difficult 
to secure good samples of the whole adult population. Moreover, 
scores in some skill and information tests continue to increase after 
others have ceased. For these reasons various authors differ in fixing 
the age of maximum mental growth. The results of measuring white
soldiers in the United States Army showed that the average recruit
made approximately the same Stanford-Binet score as a school
youngster of thirteen years and nine months of age. Terman (1916) placed
mental maturity at sixteen years, Pintner (1931) at fourteen years,
Baker (1935) at fifteen years and eight months, Kuhlmann (1922) at
eighteen years, Terman and Merrill at sixteen years, Kuhlmann
(1939) at sixteen years, and Wechsler (1944) at twenty years. These
discrepancies may be overlooked in rough comparisons, but they often
make it inadvisable to compare scores from one scale with those of
another without adjustment. It seems probable that there will
always be discrepancies in estimation of the average age of reaching
maturity until independent components of the skills which are to be
tested are defined more accurately and measured separately.

Older Persons Given Higher IQ's

Wechsler has found, as was previously demonstrated by several 
others, that mean test scores declined for age groups after twenty 
years. However, he believes that the IQ should not decline, even 
though a person’s absolute ability or speed has decreased. Therefore, 
he has published tables in which the same score yields a higher IQ 
as age increases. For instance, a total score of 98 will give an IQ of
100 at twenty years of age, of 102 at twenty-five years, 104 at thirty
years, 106 at thirty-five years, and reaches 114 at fifty-five years.
Wechsler's IQ's are consistent with his definition of an IQ, namely that
it corresponds to a definite centile of a group. He also gives an efficiency
quotient, which is simply the IQ taken from his "Table for Adults,
20 to 24 years." This age group is regarded as the most efficient on
the scale. Once more we are made aware that good test interpretation
requires considerable study of test norms.
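
The difference between Wechsler's IQ and his efficiency quotient can be
shown with the few values quoted above. The sketch is a toy lookup, not
Wechsler's tables: it holds only the five ages given in the text for a
total score of 98 and deliberately refuses to interpolate for other
ages. All names are hypothetical.

```python
# IQ's quoted in the text for a total score of 98, keyed by age in years.
IQ_FOR_SCORE_98 = {20: 100, 25: 102, 30: 104, 35: 106, 55: 114}

def iq_at_age(age):
    """IQ for a total score of 98 at one of the ages quoted in the text."""
    if age not in IQ_FOR_SCORE_98:
        raise KeyError(f"no value quoted in the text for age {age}")
    return IQ_FOR_SCORE_98[age]

def efficiency_quotient():
    """The same score read from the youngest adult group (20 to 24 years)."""
    return IQ_FOR_SCORE_98[20]

print(iq_at_age(55))          # 114
print(efficiency_quotient())  # 100
```

The fifty-five-year-old with a score of 98 thus has an IQ of 114
relative to his own age group, but an efficiency quotient of 100
relative to the most efficient group.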

Constancy of the IQ 

Great importance has been attached to constancy of the IQ of a 
person who is tested at various ages for two reasons. One is found in 
the hypothesis that an IQ represents a constant native ability to 
develop. None of the leaders in the field of measurement subscribes 
to this hypothesis without reservations, but it has often been assumed 
to be true by others. The constancy of obtained IQ's is mistaken for
evidence that the IQ's indicate native capacity. This conclusion is
justified only if the environment has been held rigidly constant. In
normal society enormous differences in motivation or opportunity
to develop are sometimes apparent even in the same family. Thus a
constantly low IQ in a child may sometimes mean a continuously
poor environment.

The other reason for attaching great importance to the finding 
of fairly constant IQ's is the need for accurate predictions. In order
to allow accurate predictions, an individual’s IQ must either be 
nearly the same from year to year or vary in regular fashion. For 
example, if a girl’s IQ were 140 at the age of eight, 96 at the age of 
ten, and 115 at the age of twelve, one could not predict from any 
test what she would rate on the next test. 

Although a large number of studies have been made to determine 
how constant IQ's actually are in various groups of persons over
various time intervals, these studies are as yet inconclusive, because
of the difficulties of accurate measurement which have been
discussed and because of the difficulties of studying the same children
over long periods of time in a constant environment. A good deal of
interesting work in determining the constancy of the IQ has been
done, however, the results of which will be discussed under changes
in the size of individual IQ's, and under correlations between tests.
Changes in IQ's are of practical significance because they show what
estimates are actually available for individual predictions. Correla- 
tions between tests are theoretically more significant, however, be- 
cause they are not affected by certain types of scaling errors. For 
instance, a very high correlation between two tests may be found 
when the IQ’s of a group have all gone up or down or when they 
have a greater or smaller dispersion on the second test. 
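
The point that a correlation is blind to a general rise or to a change
in dispersion can be demonstrated directly. The sketch below uses a
small set of invented IQ's (not data from any study) and a plain Pearson
product-moment formula; the function and variable names are
hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

first_test = [85, 92, 100, 104, 111, 126]            # invented IQ's

# Retest on which every IQ has gone up 5 points.
all_raised = [iq + 5 for iq in first_test]
# Retest on which the dispersion about 100 is half again as large.
wider_spread = [100 + 1.5 * (iq - 100) for iq in first_test]

print(round(pearson_r(first_test, all_raised), 6))    # 1.0
print(round(pearson_r(first_test, wider_spread), 6))  # 1.0
```

Both retests correlate perfectly with the first test even though no
individual IQ is unchanged, which is why changes in the size of IQ's
have to be examined separately from correlations.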

Changes in Size of IQ. A number of studies have been made of
changes in IQ's over periods of less than 3 months. When the
environment was not markedly changed, most of these studies show an
average improvement of about 5 points in IQ from the first to the
second test. One of the most complete and interesting studies is that
of Terman and Merrill (1937), who reported that all their
correlations between IQ's on the two forms, L and M, were distinctly
fan-shaped, with the larger variations occurring among the higher IQ's.
From combined results of ages from three to eighteen years, the
average differences between IQ's on the two forms ranged from 2.49
for persons below 70 IQ, to 5.92 for persons above 130 IQ. The 1,291
persons whose IQ's were from 90 to 109 showed an average difference
of 5.09 points, which means that approximately one half of
the sample group varied by more than 5 points and the other half
by less than 5 points on a retest given a few days later. The mean
variation of the superior groups was slightly more, and of the
inferior somewhat less than 5 points.

Another interesting report is that of Psyche Cattell (1931), who 
obtained retests on the 1916 Stanford-Binet test at intervals of from 
zero to seventy-two months. Illustration 43 summarizes her findings 
in part. The striking thing is the loss of IQ in the children who
were below average and the gain in those above average. This
tendency is very marked in the longer intervals. No simple explanation
of this was given. It may be that the retarded reach their maturity
earlier than the accelerated, or the results may be due, in part at
least, to difficulties in scaling items, particularly at the upper end of
the scale.

Similar findings are summarized by Conrad, Freeman, and Jones
(1944), who reviewed research done over a period of 20 years.


ILLUS. 43. MEDIAN IQ CHANGES OVER VARIOUS PERIODS
AFTER CATTELL, 1931

                  MONTHS BETWEEN TESTS     MONTHS BETWEEN TESTS
                        0 to 24                  36 to 72

    IQ              N    Points Changed      N    Points Changed

    below 70        11       + 4.0           15       - 7.0
    70-79           41       - 0.8           38       - 3.5
    80-89           75       + 0.8          110       - 3.3
    90-99          116       - 0.6          193       - 2.2
    100-109        127       + 1.1          166       + 1.7
    110-119        138       + 1.1          101       + 1.4
    120-129         66       + 0.8           35       + 4.8
    130-139         26       + 1.5           13       +16.0
    140 +           18       + 8.5            0         --

(After Cattell, 1931, p. 547, Stanford-Binet Test. By permission of the
Editor, Journal of Educational Psychology.)


Correlations between Tests. Numerous studies have been made
of correlations between tests and retests. In general, the correlations
over longer periods are smaller than the correlations over shorter
periods.

Terman and Merrill (1937) reported that correlations between
Forms L and M, administered within a few days, have a median of
.88 for ages two to six years, and of .93 for ages above six years. The
members of the twenty-one age groups were all within 4 weeks of a
birthday or half-birthday. The spread of scores and the consequent
correlations are therefore smaller than would be found in age groups
with ranges larger than 8 weeks. The authors also calculated
correlations (p. 46) for IQ's from Forms L and M for five IQ groups in
single age groups:


    130 and over     .898
    111-129          .912
    90-109           .924
    70-89            .945
    Below 70         .982


The lower IQ's appear to be more stable than the higher, but all are
highly predictable. Correlations usually drop to about .70 when the
first test is given to a subject who is more than five years of age and
the retest is given after an interval of 3 years. For earlier ages or
greater periods between tests the correlations are lower.

Follow-up studies of a group of a thousand gifted children were 
reported by Terman and Oden (1947). At the time of the original 
tests in the early 1920s the group had Stanford-Binet IQ's of 135 or
more. The average IQ was about 160. About 20 years later 950
individuals, who averaged 152 in IQ on the original test, were
retested using a difficult synonym-antonym and analogies test called
the Concept Mastery Test. The average age at the time of retest
was about thirty years. The results were not easy to interpret since
no large random sample of adults had been tested with the Concept 
Mastery Test. From results of testing college students with both the 
Concept Mastery Test and the ACE Psychological Test or the Thorn- 
dike CAVD Tests, it seemed probable, however, that the gifted adult 
group scores in the Concept Mastery Test were about 2.1 standard 
deviations above an estimated mean for adults in general, or an
average IQ of 134. This apparent regression from an average IQ of 152
to 134 (18 points) may be accounted for in part by the unreliability
of the two tests, and the fact that the two tests probably do not
measure the same types of skills. The authors estimate that about
9 points or one half the amount of regression is due to test
unreliability, about 4 points to differences in functions measured, and about 5
points to maturational and environmental changes. About 6.8 per
cent of the group earned Concept Mastery scores below the average
of college students. There was evidence that some of these scores
were not representative of true ability because they were made by 
persons who had graduated with honors from leading universities 
and were outstanding lawyers, physicians, or engineers. 

R. L. Thorndike (1948) reported a similar study based on a steeply 
graded power test of vocabulary, used on a large adult voting sample, 
and the Concept Mastery Test. He estimated the average score of 
those retested by Terman to be about 1.73 standard deviations
above the voting adult average. If the original childhood average
IQ is taken to be 3.00 standard deviations above the mean, the gifted
group had regressed about 40 per cent of their original position. The
adults who were so outstanding as children now spread out over the
highest quarter of adult voters, but about half of them fall in the
highest 5 per cent. In childhood they all fell in the highest 1 per
cent.
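
The regression figures quoted from Terman and Oden and from Thorndike
can be checked with simple arithmetic. This is a minimal sketch using
only the numbers given in the text; the function name
regression_fraction is hypothetical.

```python
def regression_fraction(original_sd_units, later_sd_units):
    """Share of the original distance from the mean that has been lost."""
    return (original_sd_units - later_sd_units) / original_sd_units

# Thorndike: 3.00 standard deviations in childhood, 1.73 as adults.
print(round(regression_fraction(3.00, 1.73), 2))  # 0.42

# Terman and Oden: 152 to 134, split into three estimated components.
total_drop = 152 - 134
components = {"unreliability": 9, "functions measured": 4, "maturation": 5}
print(total_drop, sum(components.values()))       # 18 18
```

The fraction of about 0.42 corresponds to the "about 40 per cent"
reported, and the three components of 9, 4, and 5 points account for the
whole 18-point drop.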


PERFORMANCE TESTS 

Most of the tests described thus far in this chapter purport to 
measure general intelligence by evaluating oral or motor responses 
to a variety of oral requests. Inspections and analyses of results have 
shown that such tests tend to be measures of verbal comprehension 
and expression to a large degree. The difficulty of applying language 
tests of intelligence to various groups was apparent at an early date.
Differences in language training, speech ability, hearing, and spoken
language were so commonly found that nonlanguage tests of
intelligence were among the earliest to be designed and standardized.
Seguin (1846) devised a form board which was essentially the same
as that now used in several scales (Illus. 44). In 1911 Witmer described
a cylinder test and a revision of the Seguin Form Board. Knox (1914)
designed a series of tests for estimating mental defects among
immigrants at Ellis Island. Healy and Fernald (1911) produced a series
of performance tests, and Pintner and Paterson (1915) adapted a
Binet Scale to the deaf. Kelley (1916) devised a construction test,
and Dearborn et al. (1916) produced a series of form-board and
construction tests. An elaborate set of form boards was constructed by
Ferguson (1920).

ILLUS. 44. MATERIALS FOR THE PINTNER-PATERSON SCALE

There is no clear dividing line between performance tests which
use various objects and those which use paper and pencil. Nearly all
performance scales include some paper-and-pencil tests of spatial
relations.

Pintner-Paterson Scale 

In 1917 Pintner and Paterson published a standardization of fifteen 
performance tests. Since they were used in the Army testing program 
and have since been widely applied or adapted, they will be described: 

1. The Mare and Foal Test (from Healy) is a picture of a farmyard from
which eleven pieces were cut. The child is presented with the material
arranged as shown in Illus. 44, 27154, and told, "Put these pieces in the
right places as soon as you can." A record is kept of the time in seconds,
within a limit of 5 minutes, and of the number of errors. This same scoring
is used in the next ten tests.

2. The Seguin Form Board (Illus. 44, 27159) is a large board, 20 x 14%
inches, from which ten geometrical shapes are cut (Illus. 45). The
directions and scoring are similar to those in the preceding test. Three
trials are given, one right after the other, and results of the best trial
are used as the final scores.

3. The Five-Figure Board (devised by Paterson, Illus. 44) is similar to the
Seguin Board, but more complex in that eleven pieces must be fitted into
five holes. One trial is allowed.

4. The Two-Figure Form Board (devised by Pintner, Illus. 46) is similar
to the Five-Figure Board, but slightly easier.

5. The Casuist Form Board (from Knox, Illus. 47) consists of twelve pieces
to be fitted into four holes. This is considerably harder than the other
form boards.

6, 7, 8. These are smaller boards in which four or five pieces are to be
fitted into one or two holes.

9. The Manikin Test (Pintner) consists of a small wooden doll cut in
six pieces which are laid out (Illus. 48). The main difficulty is
encountered in making the arms and legs fit exactly, for the places where
they fit into the body are not the same shape. This test is for
five-year-olds.

10. The Feature Profile Test (Knox, Illus. 49) consists of eight pieces
which are to be assembled to form an ear and the profile of a face.

11. The Ship Test (after Glueck, Illus. 44) is a picture of a steamship cut
into ten rectangular pieces of equal size.

12. The Picture Completion Test (Healy) consists of a large picture of
a rural scene, or of several rural scenes put together. Ten small squares
are cut out, as shown in Illus. 50. These are to be filled by selecting the
most suitable pictures from among forty-eight squares. The score is the
number of blanks correctly filled in 10 minutes. Actually, 5 minutes is
usually ample time, so the test has been standardized with a 5-minute time
limit.

13. The Substitution Test (Woodworth and Wells) is a paper-and-pencil
test in which rows of five sorts of geometric figures are to be marked with
numbers according to the key at the top of the page. The score is the time
needed to finish fifty figures (Illus. 51).

14. The Adaptation Board (Goddard) consists of a board with four round
holes, three of them 6.8 cm. in diameter and the fourth 7^ cm. The child
is shown how one block fits exactly into the large hole, and then he is
asked to "Put it into the right hole" when the board is placed in four
different positions. The score is the number of correct trials (see
Illus. 52).

15. The Cube Test (Illus. 53) consists of five black, one-inch cubes.
Four cubes are placed about 2 inches apart in a row on the table in
front of the examinee. With the fifth cube, the examiner taps the other
four in a particular order at the rate of one tap a second. Then the
examinee is told, "Now you do what I did." The order is simple at first,
but later becomes more complex. The score is the number of correct trials,
which, of course, reflects the difficulty of the tasks completed.


ILLUS. 45. SEGUIN FORM BOARD. Posed by Frank P. Greene.
ILLUS. 46. TWO-FIGURE FORM BOARD. Posed by Mungo Miller.
(Manufactured by the C. H. Stoelting Co., Chicago, Ill.)

ILLUS. 47. CASUIST FORM BOARD
ILLUS. 48. THE MANIKIN TEST
ILLUS. 49. THE FEATURE PROFILE TEST
ILLUS. 50. THE HEALY PICTURE COMPLETION TEST, FORM I
Posed by Donald Johnson.
(Manufactured by the C. H. Stoelting Co., Chicago, Ill.)

ILLUS. 51. SUBSTITUTION TEST
(Woodworth and Wells, 1911. By permission of the Editor,
Psychological Monographs.)


The authors prepared a modified age scale and a point scale from 
their results, including time, errors, and total accomplishment. They 
also published percentiles for each test for each age group and
mental-age tables for each test. To secure a single representative mental age,
they suggested using a person's median mental age, and this practice
has been widely followed. The authors did not offer a systematic
analysis of the processes needed for success in these tests, but some
aspects seem obvious:

1. Twelve of the tests are speed tests where a few seconds represent the
difference between successive age levels.

2. In all tests, manipulation of materials is required, and in the form
boards precise fitting movements are also required.

3. Perception and comparison of form are essential with the form boards,
and ability to interpret pictorial material is appraised by five of the tests
(1, 9, 10, 11, and 12).

4. In all the tests a systematic procedure or plan is effective in reducing
the time needed for success.

5. Immediate memory span is stressed in the Cube Test and the
Substitution Test.

A short form of this scale, including Tests 1, 2, 3, 4, 5, 9, 10, and 11, 
has been issued by Pintner and Hildreth (1937). 
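
The practice of taking a person's median mental age across the separate
tests can be sketched as follows. The mental ages below are invented for
illustration (they are not taken from the Pintner-Paterson tables), they
are expressed in months, and the function name is hypothetical.

```python
from statistics import median

def median_mental_age(mental_ages_in_months):
    """Single representative mental age: the median over the separate tests."""
    return median(mental_ages_in_months)

# Invented mental ages (months) earned on five separate performance tests.
ages = [96, 102, 90, 108, 99]
print(median_mental_age(ages))          # 99

# One very poor test result moves the median only slightly.
ages_with_outlier = [96, 102, 48, 108, 99]
print(median_mental_age(ages_with_outlier))  # 99
```

The second call illustrates one reason the median is attractive here: a
single unusually low or high test, which is common when scores depend
on a few seconds of working time, does not pull the representative
value far.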


ILLUS. 52. ADAPTATION BOARD
ILLUS. 53. THE KNOX CUBE
(Manufactured by the C. H. Stoelting Co., Chicago, Ill.)


Arthur Performance Scale 

In 1930 Arthur published a restandardization of ten of the tests
in the Pintner-Paterson scale (1, 2, 3, 5, 7, 9, 10, 11, 12, and 15) which
was based on scores of approximately 1,100 school children from six
to sixteen years of age. She also restandardized the Porteus Maze
Tests and the Kohs Block Design Scale.

The Porteus Maze Test (1924) requires one to draw with a pencil
through printed mazes (Illus. 54, A and B). Eleven mazes have been
prepared for age levels from five years to superior adult. The mazes
are called roadways, and one's pencil line must not cross any printed
lines or go into any closed roads. Failure to observe these instructions
leads to the immediate removal of the printed maze and a second
trial with a duplicate form. Only two trials are allowed on nine of
the mazes, but on the twelve- and fourteen-year levels four trials
are given. Success on the later trials counts less than success on the
first. The total credit is given in years and months of mental age. No
time limits are set. Speed is not a factor in this score, but caution is
probably a large factor.


ILLUS. 54A. PORTEUS MAZES
ILLUS. 54B. PORTEUS
(Manufactured by the C. H. Stoelting Co., Chicago, Ill.)


The Kohs Block Design Scale (1927) requires the examinee to 
duplicate with colored cubes the designs on seventeen printed cards. 
All the cubes are the same, each having four sides colored red, white, 
blue, and yellow respectively; and the other two sides being divided 
diagonally between blue and yellow, and red and white respectively. 
After a simple demonstration, the cards are presented with the
required number of blocks. The simplest designs use only four blocks
and the most complex, sixteen. Time is limited to short working
periods for each design, and the scores are points assigned on the basis
of speed of completing a design.

Arthur's scoring ignored errors or number of moves, but allowed
credit for either speed of performance or, in the untimed tests,
achievement. Weighting the scores for errors was not found to have
much effect upon one's position in a group of persons. Arthur
published age equivalents for the scores of each test. She also furnished
point scales which assigned credit on the basis of the power of a test
to discriminate between adjacent age groups. By this procedure the
relative weight allowed for a particular test may vary from year to
year. She advised the summation of points to give a total score for the
test. This total may be converted to a performance age (PA) by
reference to a table.

A number of other nonverbal scales designed to measure general 
ability are described in Chapter VIII, “Group Tests of Ability.” 

TESTS OF ABSTRACTION, 

OR CONCEPT FORMATION 

Although nearly all the tests of achievement or aptitude require 
a person to form concepts or to use concepts already formed, there 
are some tests which emphasize concept formation in their scores. 
Dr. Kurt Goldstein studied several hundred brain-injured patients 
for 10 years after World War I and described in great detail their
inability to make or use abstractions as normal people do. Partly as
a result of his work several tests have been published which allow an 
observer to see the step-by-step formation of concepts, or the diffi- 
culties encountered. 

One of the most thorough studies of concept formation is that
reported by Rapaport, Gill, and Schafer (1945). They distinguish three
levels of concept formation. The first, called concrete, is illustrated
by a nonverbal sorting of objects which belong together because they
are similar in some sensory percept. The second, called functional, is
shown by a verbal or nonverbal sorting of objects which are used
together. The third, called abstract-conceptual, is illustrated by
verbal statements indicating active induction and deduction. These
authors point out that the answers for the Similarities Test of the
Bellevue Scale are results of abstract-conceptual thinking, but that
the test often fails to show the degree of deterioration because the
patient may retain stereotypes of speech in spite of severe
deterioration in concept formation. The Sorting Test of Goldstein and
Scheerer (1941), which consists of objects common in everyday
experience, is a better test because it reveals all three types of concept
formation, and shows impairment at a stage much earlier than the
Bellevue Similarities Test. The Hanfmann-Kasanin Test uses
geometrical forms and colors which are not commonly used in everyday
experiences. The test makes a greater demand on ability to examine
and group objects according to new concepts than either of the two
just mentioned.

The Sorting Test uses thirty-three objects which are common to
most households: a knife, fork, and spoon; a miniature knife, fork, and
spoon; a screwdriver and pliers; a miniature screwdriver, pliers,
hammer, and hatchet; two nails and a block of wood with a nail in
the center of it; two corks; two sugar cubes; a pipe; a real cigar and
cigarette; an imitation cigar and cigarette; a matchbook; a rubber
ball; a rubber eraser; a rubber sink stopper; a white filing card; a
green cardboard square; a red paper circle; a lock; and a bicycle bell.

Part I of the test consists of seven instances when a different
object is selected from the group and the subject is told to find all the
remaining objects which belong with it. After each grouping the
subject is asked, “Why do all these belong together?” After each
complete inquiry, the objects are all grouped together again before the
next trial. Part II of the test consists of twelve situations when the
examiner places a group of objects before the subject and says, “Why
do all these belong together?” The scoring includes three variables.

a. Adequacy of sorting is the degree to which all relevant objects
are grouped together by some common or clearly defined similarity.
Adequacy of verbalization is the degree to which explanations are
given for grouping. The sorting and the verbalization may not be in
close agreement.

b. Conceptual level. Here concrete, functional, and abstract
levels, or pathological variations of these, are described. Thus the
concrete level is illustrated by the explanation that the knife, fork,
and spoon are grouped together because “You find them on the
table.” To say, “You eat with them,” shows a functional level, and
“They are silverware,” represents the abstract level. Some
pathological groupings are called by Rapaport syncretistic, fabulated,
symbolic, and chain. Syncretistic definitions are so broad as to allow
inclusion of nearly everything: “They all belong to men.” Fabulated
definitions make one object the starting point of a story which brings
in other objects. Symbolic definitions radically reinterpret the
meaning of objects; a piece of paper is a room, or the large and small
forks are mother and daughter. Chain definitions are a series
suggested by different aspects of different objects. Thus a red paper circle
is placed with a red object, then the bicycle bell is added because it
is round, then the pliers because they are metal. There is little
retention of the conceptual frame of reference from one moment to the
next.

c. Concept span. This variable refers to the looseness or, on the
other hand, the narrowness of grouping. Loose grouping is seen when
the lump of sugar is placed with the eating utensils. Very loose
grouping is seen in the case of a person who put all objects together
that had the slightest roundness. The narrow grouping is seen in
compulsive or overmeticulous persons who stick very rigidly to
several aspects of the sample and find reasons for grouping none or
only one or two objects with it. Other types of narrow grouping are
due to inertia or to the use of symbolic meanings.

The Hanfmann-Kasanin Concept Formation Test and the
Weigl-Goldstein-Scheerer Color-Form Sorting Test each present about
twenty blocks of different colors and shapes, and ask the patient to
sort them into similar piles. Various procedures give clues when
errors in grouping are made to help the subject find the “correct”
grouping. The number and types of groupings indicate the methods 
of thinking. Degrees of flexibility, fluidity, persistence, and rigidity 
can be observed and recorded. In general those with severe brain 
injury of the frontal lobes can only group the blocks by one aspect 
at a time, such as color or tallness. When two or more aspects are
required in grouping, such as tall and wide or tall and narrow, the 
abstraction involved is too difficult for many brain injured, even 
though many successive trials are allowed. A good deal of research 
by clinicians is now going on using material of this kind. 

PRINCIPAL USES AND NEEDED RESEARCH 

Individual intelligence scales are frequently used in schools and 
clinics. In schools they aid in the adjustment of pupils by analyzing
the reasons for unusual success or failure. Among slow or fast pupils
determining the MA will often help to decide how much acceleration
or retardation is desirable. In clinics an MA can aid in detecting
the effects of serious handicaps. Special adaptations of the Binet
and of performance tests have been made for the blind by S. B. Hayes
(1930, 1941) and for the deaf by Pintner (1931, 1945) and Hiskey
(1941). Thus, for children who have poor hearing or vision, or
who have speech, reading, or emotional difficulties, an MA and an IQ 
will often aid by indicating the most reasonable course of action. 

When making individual applications of any test it is necessary
that all important aspects of the situation be considered. For
instance, John F., who is eleven years, three months of age, was brought
to a community clinic as a candidate for a special opportunity class.
He was consistently failing in reading and arithmetic in the fourth
grade in a large public school. He had missed four school days
recently and admitted wandering about the city with another boy most
of the time. A medical report showed negative results. He was a little
overweight but normally active. On the Stanford-Binet he earned
an MA of ten years, two months, and an IQ of 90, and on the
Wechsler-Bellevue Test he earned a Verbal IQ of 81, a Performance IQ
of 91, and a Total IQ of 86. His poorest scores were on the
Arithmetical Reasoning, Digit-Span, and Digit-Symbol tests. On the latter
there was some question of fatigue, but on most of the tests he seemed
well motivated. He asked several times how well he was doing. On
a Metropolitan Achievement Test he earned an average Educational
Age of 9-3, or the equivalent of grade 3.3 (3 3/10 grades). This yields
an Educational Quotient of 84 (9-3/11-3). These results showed that
he was a little more retarded in school achievement than in his
general development. In such cases additional evaluations of social and
emotional adjustments are, of course, necessary before a remedial
program is advised.
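
The quotients in the case of John F. are ratios of an attained age to
the chronological age, multiplied by 100. The sketch below is only an
illustration of that arithmetic; the function names are hypothetical
and not part of any published scoring manual. Ages written as
years-months are first converted to months. Computed this way the IQ
rounds to 90, as reported. The same arithmetic applied to the
Educational Age of 9-3 gives a quotient near 82 rather than the
reported 84; the text does not explain the difference.

```python
def months(years: int, extra_months: int) -> int:
    """Convert an age given as years and months into months."""
    return years * 12 + extra_months


def ratio_quotient(attained_months: int, chronological_months: int) -> float:
    """Ratio quotient: 100 * attained age / chronological age."""
    return 100 * attained_months / chronological_months


# Figures taken from the case description of John F.
chronological_age = months(11, 3)   # eleven years, three months
mental_age = months(10, 2)          # Stanford-Binet MA
educational_age = months(9, 3)      # Metropolitan Achievement Test EA

iq = ratio_quotient(mental_age, chronological_age)
eq = ratio_quotient(educational_age, chronological_age)

print(f"IQ = {iq:.1f}")
print(f"EQ = {eq:.1f}")
```

The point of the comparison is the one the text draws: the quotient
based on school achievement falls somewhat below the quotient based on
mental age.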

Individual tests are also used to predict later successes. A recent
study by R. L. Thorndike (1947) reports the prediction, from earlier
Stanford-Binet scores, of scores on a difficult Verbal-Comprehension
test given during the last year of high school. Illustration 55 shows
these predictions. They are large enough to be significant for groups
but not for individual counseling. The correlations decrease from
.71 for Binet Tests given in the tenth grade to .39 for Binet Tests
given in the first grade. These correlations would be much higher if
the group were not so highly selected.

ILLUS. 55. PREDICTION OF SCHOLASTIC APTITUDE VERBAL TEST
SCORES, FROM EARLIER STANFORD-BINET SCORES
(From Thorndike, 1947. By permission of the author and the editor of the
Journal of Educational Psychology.)
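
One way to see why correlations of this size serve for groups but not
for individual counseling is to compute two standard statistics that
the text does not itself report: the proportion of variance accounted
for (r squared) and the coefficient of alienation, the square root of
1 minus r squared, which gives the error of prediction as a fraction of
the error made when predicting without any test. The sketch below
applies them to the two correlations quoted above; the function names
are illustrative only.

```python
import math


def variance_explained(r: float) -> float:
    """Proportion of criterion variance accounted for by the predictor."""
    return r * r


def coefficient_of_alienation(r: float) -> float:
    """Prediction error relative to predicting without the test at all."""
    return math.sqrt(1 - r * r)


# Correlations quoted in the text for Binet tests given at two grades.
reported = {"tenth grade": 0.71, "first grade": 0.39}

for grade, r in reported.items():
    print(
        f"{grade}: r = {r:.2f}, "
        f"variance explained = {variance_explained(r):.2f}, "
        f"alienation = {coefficient_of_alienation(r):.2f}"
    )
```

Even the higher correlation leaves about seven tenths of the original
error of prediction for a single pupil, while averages for groups can
still be forecast with useful accuracy.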

Wechsler-Bellevue Scale 

The uses of this scale have been reviewed by Wechsler, who found
that certain mental disorders, race, age, and experience are associated
with patterns or diagnostic profiles. He specifically warns against
misuse of these patterns, but indicates that a careful consideration
of them, together with other facts, will provide a more accurate
diagnosis and prognosis. The following are samples of four patterns
which seem to have been found fairly frequently.

Racial groups. There are some racial differences which are in 
need of further study. In general, among those tested, Jews did
better on verbal than on performance tests, and Italians did better on
performance than on verbal tests. 

Occupations. Among adults a person’s occupation may be related 
in some way to his scores. Carpenters generally scored higher on
performance tests, and lawyers and teachers on verbal tests. 

Age. Information, Comprehension, Vocabulary, Object Assem- 
bly, and Picture Completion tests hold up with advancing age bet- 
ter than the others do. The others decline more rapidly with age. 
In general the tests which decline with age require rapid accurate 
calculations, observations, or problem solving, while those which 
hold up with age are based principally on remote memory in con- 
trast to immediate memory. The relative decline in normal adults 
(from fifty-five to fifty-nine years) is indicated by the ratio of .84 when 
the scores of don’t-hold tests are divided by the total score of hold
tests. The tests that hold up well with age are also the tests which 
usually hold up best among those with mental disorders. 

Mental disorders. Wechsler points out that psychoses of every 
type, organic brain disease, and to a lesser extent most psychoneuroses 
show a much better performance in the verbal than in the nonverbal 
tests. A difference of from 8 to 10 points between verbal and
nonverbal test totals is within normal range, but the amount varies with
the intelligence level of the individual. Adolescent psychopaths and
high-grade mental defectives, however, usually do better on the
performance tests than on the verbal tests. Their failures are due to
lack of the required ability rather than to disorganization of the 
ability. 

Another clinical application is the measure of the spread of scores.
Each subtest is equated to the others by means of a point scale, so
that they all have a mean of 10 and a standard deviation of 3. Hence,
if a subject's total score is 95, the expectancy for each subtest is 9.5,
since there are ten subtests. To be significant, the amount by which
the tests must differ from the mean is roughly one fourth of the mean
subtest score. Wechsler has adopted a practical method of
summarizing deviations for persons with IQ's between 80 and 110, by
using a plus for deviations from 1.5 to 2.5 above the mean and a
minus for a similar deviation below the mean. Two pluses show a
deviation of 3 or more above the mean, and two minuses show the
same deviation below the mean. Illustration 56 shows a typical
pattern for organic brain diseases. Wechsler also gives typical profiles
for schizophrenics, neurotics, adolescent psychopaths, and mental
defectives.
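
The plus-and-minus notation just described can be sketched as a small
routine. This is an illustration only, not Wechsler's own procedure:
the subtest scores are hypothetical, chosen so that the total is 95 as
in the example above, and a deviation falling between 2.5 and 3, which
the rule as stated leaves undefined, is assumed here to receive a
single mark. The routine does not check that the IQ lies between 80
and 110.

```python
def scatter_mark(score: float, mean: float) -> str:
    """Return '', '+', '++', '-' or '--' for one subtest score."""
    deviation = score - mean
    size = abs(deviation)
    if size >= 3:
        marks = 2
    elif size >= 1.5:      # assumption: 2.5 to 3 also gets a single mark
        marks = 1
    else:
        marks = 0
    sign = "+" if deviation > 0 else "-"
    return sign * marks


def scatter_profile(subtests: dict[str, float]) -> dict[str, str]:
    """Mark every subtest against the subject's own mean subtest score."""
    mean = sum(subtests.values()) / len(subtests)
    return {name: scatter_mark(score, mean) for name, score in subtests.items()}


# Hypothetical weighted scores for ten subtests; total 95, mean 9.5.
example = {
    "Information": 12, "Comprehension": 11, "Arithmetic": 9,
    "Digit Span": 10, "Similarities": 10, "Picture Arrangement": 9,
    "Picture Completion": 10, "Block Design": 6,
    "Object Assembly": 7, "Digit Symbol": 11,
}

profile = scatter_profile(example)
for name, mark in profile.items():
    print(f"{name:<20} {example[name]:>2} {mark}")
```

Each score is compared with the subject's own mean rather than with
the standardization mean of 10, which is what makes the profile a
measure of scatter within the individual.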

ILLUS. 56. ORGANIC BRAIN DISEASE PATTERN

Information            14  +
Comprehension          12  +
Arithmetic              9
Digits                 13  +
Similarities           11
  Verbal Total         59

Picture Arrangement     9
Picture Completion      8  -
Block Design            4
Object Assembly         1
Digit-Symbol            3
  Performance Total    25

Total IQ               95
Verbal IQ             115
Performance IQ         74

Case O-1. Male, age thirty-four, showing definite neurological signs
including marked hydrocephalus, facial weakness, slight tremor,
absent abdominals. Also suggested Babinski on left side with mild
postural deviations on same side. Diagnosis: post-meningoencephalitic
syndrome. At age six months patient had an injury with sequelae
lasting six months, which was diagnosed as meningitis. This case
shows the four most conspicuous signs of organic brain disease: large
discrepancy between Verbal and Performance in favor of the former,
very low Blocks combined with even lower Object Assembly and very
low Digit-Symbol. While all of the test scores on the verbal part of
the examination are average or above, the two lowest are Similarities
and Arithmetic, which are in line with the organic picture. The only
exception is the Digit Span, which is good, both forward (8) and
backward (6).

(After Wechsler, 1944, p. 161. By permission of the author and
Williams & Wilkins Company.)

The clinical use of the Bellevue Scale has been reported on
extensively by Rapaport (1945). Rabin (1945) and Watson (1946) have
reviewed in fifty-one technical articles the clinical and other uses
that have been reported. Rabin concludes that the Bellevue verbal
scale correlates more highly than the full scale or the performance
scale with most other intelligence tests. The verbal scale compares
well with other tests in predicting academic success, but the
performance scale is practically useless for such prediction. Rabin also
feels that the measures of scatter or intrapersonal patterns have
succeeded in differentiating some groups but not individuals, and that
there is still insufficient agreement among group differentiations
because of failures to control or allow for differences in age, race,
schooling, intellectual level, and cultural factors. Long-range retest
studies are still rare and much needed.

The use of deterioration indices is still in an experimental stage.
Reports by Magaret and Simpson (1948) and by Garfield (1948) both
indicate that for groups of fifty and one hundred mental hospital
patients the Wechsler-Bellevue index of deterioration and the
Shipley-Hartford Conceptual Quotient (CQ) showed correlations that
were not significantly different from zero. Likewise, neither index
was significantly correlated with psychiatrists’ ratings of
deterioration, nor with subsequent declines in total scores over a period of 11
months. These findings do not mean that these evaluations are
without merit. Rather, they mean that more research is needed to define
and appraise more accurately the phenomena under consideration.

Problems for Research 

The Bellevue Scale opens the way to many research activities, most 
of which have been pointed out by Wechsler and others. Further
study seems needed to establish the optimum length of the subtests
for diagnostic purposes. Also the question of what abilities are
actually being measured must be answered statistically sooner or later.

Cattell (1943) in a summary of theories of intelligence has defined 
two different kinds of mental ability, fluid and crystallized. Fluid 
ability is a “purely general ability to discriminate and perceive
relations.” It increases until maturity, then declines slowly. It accounts
for the intercorrelations among children’s tests of intelligence and
among the speeded or adaptability tests of adults. Crystallized ability
consists of memory, skills, and discriminatory habits established in
a particular field. These habits were originally established through
the operation of fluid ability, but no longer require insightful
perception to a high degree. At all ages intelligence tests combine both
fluid and crystallized ability, but in childhood fluid ability is
normally predominant, while among adults the performance is more
determined by crystallized abilities. For more thorough discussions
of some of the problems in defining and measuring intelligence, one
should consult Stoddard’s Meaning of Intelligence (1943).

Another important field of research lies in determining the effects
of environment upon measures of ability. Several authors have pro- 
duced tests which they hoped would be relatively free from cultural 
influences, but little evidence of the value of these tests has come 
to hand. Allison Davis (1948) has pointed out that some of the 
Stanford-Binet test items seem to have a socio-economic bias. He 
changed certain items to eliminate what he thought were cultural 
loadings of content, in such a manner that the essential problem
appeared to be unchanged. His results showed much smaller differences
between persons in different socio-economic groups than were found
when using the standard tests. His findings point to the possible
injustice of using one test for many different groups, and the need for
careful research to correct this situation.

A rough inspection of Wechsler’s correlation matrices indicates
a rather predominant verbal factor with small and unknown amounts
of number, perception of form, spatial thinking, and dexterity.
Reasoning is probably significant, especially in the tests which do not
hold up well with age. A thoroughgoing factorial analysis followed
by the development of unique measures is desirable.

Another research problem involves the method of combining and 
relating subtests. Although Wechsler writes that intelligence is not
a sum of abilities, he does sum up arbitrarily weighted test scores
in such a way that an IQ may have a great variety of qualitative
variations. One can only roughly guess what is included in intelligence
by this procedure. 

A good deal of research is also needed to show the relation between 
modes of adjustment and test scores, particularly among persons with 
mental disorders. Such factors as paranoia, fear, and low energy may
affect some test scores more than others. 

SUMMARY 

In conclusion it should be said that all of the scales discussed in 
this chapter originated as rough samplings of various types of be- 
havior. Usually the types of behavior were vaguely defined by the 
authors of the tests, who either chose tests which distinguished the 
bright from the dull in school or in other situations, or chose tests 
for verbal or nonverbal characteristics, or for some other psycho- 
logical pattern, such as reasoning or problem solving. Practically 
all these tests were in approximately their present form twenty or 
more years ago, and few of them have yet been submitted to careful 
studies of sampling of components, scaling, and analysis of the
emotional variables in the test situation. There is great need for
research in the development of individual verbal and individual
performance scales, which will clearly evaluate some well-defined
patterns of behavior. In order to be most useful a score must always
represent the same qualitative and quantitative behavior pattern.
Chapters VIII and XIV go much further into the problem of de- 
fining and measuring unique ability. 

STUDY GUIDE QUESTIONS 

1. What guided Binet in his selection of items?

2. What evidence is there that Binet's 1908 scale was not adequately
standardized?

3. What evidence is there that bright and dull pupils were better
selected by tests calling for only one skill than by tests calling for various
skills? How is this evidence explained?

4. What use of speed, number, verbal comprehension, and reasoning
was made in the 1937 Stanford-Binet tests?

5. In what respects does the Kuhlmann-Binet test differ most from the
Stanford-Binet?

6. What age ranges are covered by the Stanford-Binet, Kuhlmann-Binet,
and Wechsler-Bellevue tests?

7. How does the Wechsler-Bellevue test differ in content, arrangement,
and scoring from the Stanford-Binet?

8. How does Terman define and determine the MA and the IQ of an
adult?

9. How does Wechsler define and determine the MA and the IQ of an
adult?

10. Indicate the relative merits of the Terman and the Wechsler methods.

11. How did Terman and Merrill select items to be included?

12. What evidence is there that a mental age of 8 represents different
skills in retarded, normal, and superior children?

13. What are the usual per cents of mentally defective, average, and
superior?

14. What variations are expected in retests on the Stanford-Binet within
a few days? Within a year?

15. What evidence is there that Stanford-Binet scores at six years predict
verbal intelligence at eighteen years?

16. What trends have been found among dull and bright children's
IQs when measured from childhood through adolescence?

17. What analyses of Wechsler-Bellevue results are of value in clinical
diagnoses?

18. What types of tests are included in the Pintner-Paterson Performance
Scale?

19. What are the usual reliabilities of verbal and nonverbal intelligence
tests, and what are the intercorrelations of these tests?



CHAPTER VII 


MEASURES 
OF EDUCATIONAL 
ACHIEVEMENT 




This chapter deals with the types of instruments available for meas- 
uring the results of formal instruction, including language, number 
skills, and special studies. Nation-wide testing programs are also de- 
scribed and practical applications and correlations between tests 
are summarized. 

CARDINAL OBJECTIVES OF EDUCATION 

During the last thirty years public and private schools have under- 
gone important changes. In many places the age of entering school 
has been lowered by the establishment of nursery schools, and the 
age of leaving school has been advanced two years, from sixteen
to eighteen years. The value of practical applications of study has
been demonstrated in the design of texts and in the changes in many
courses. In order to bring about the best social development, the
practice of promoting almost all pupils regularly has become
widespread.

Objectives and methods of instruction have been scrutinized and
redefined by educational leaders and classroom teachers. The ideals
of the Progressive Education Association have been recorded by
Smith and Tyler in Appraising and Recording Student Progress
(1942). The National Society for the Study of Education has recently
issued several yearbooks bearing on the appraisal of objectives.
Especially noteworthy is the Forty-Fifth Yearbook, Brownell, The
Measurement of Understanding (1946). The philosophy of The National
Vocational Guidance Association has been voiced by Layton (1948).
The National Education Association has published a number of
volumes on objectives, among which are Lorge, Methods of Research
and Appraisal in Education (1945); Conrad, Psychological
Tests and Their Uses (1947); and Margaret E. Bennett, Counseling,
Guidance, and Personnel Work (1945).

The following seven goals are found in almost all of the recent 
publications. 

1. Basic information. This includes language, form, and number
knowledge at all grades. These are discussed in this chapter.

2. Skill in thinking. Selection of evidence, drawing inferences,
making practical applications. (See Chapters VII through XI.)

3. Discovery and development of an individual's highest aptitudes.
Basic aptitudes are discussed in Chapter VIII, and artistic aptitudes
in Chapter X. Special knowledge and skill tests in social sciences,
physical sciences, foreign languages, and business studies are
described below.

4. Discovery and remedial treatment of poor social and emotional
adjustments. (See Chapter XXII, “Personality,” Chapter XXIII,
“Rorschach Methods,” and Chapter XXIV, “Observations of
Behavior.”)

5. Development of a sensible vocational goal. (See Chapter XX,
“Interests.”)

6. Development of interest in good civic, social, and artistic
activities. (See Chapter XXI, “Appraisals of Attitudes.”)

7. Good physical health.

The above goals show that education now, as ever, includes much 
more than the traditional reading, writing, and arithmetic. Those 
chosen for discussion have been selected on the basis of wide usage, 
reliability, and interesting diagnostic possibilities.

BATTERIES OF ACHIEVEMENT TESTS 

Because many school systems have similar courses of study many 
publishers of achievement tests have developed batteries of tests 
(Appendix II). In order to avoid coaching or practice effects, most 
of these batteries are issued in more than one form. For many years 
the Cooperative Test Service and the College Entrance Board have 
issued annual forms. The other publishers issue from two to five 
forms at each level.

The time limits are fairly liberal in most cases, but speed of work
is also a factor in certain tests. For the first, second, and third grades
about 50 minutes are usually allowed, divided into two or more
periods. For the fourth through the sixth grade about four 40-minute
periods are generally used. For the seventh through the ninth grade six
40-minute periods are required for complete batteries.

A great deal of emphasis has been placed in recent years on the 
development of critical or constructive thinking. Test items to
appraise the drawing of accurate inferences, however, have been
common in both science and literature tests for at least twenty years
(Illus. 12). Probably the presence of these items in standard batteries
has helped to emphasize this important school objective.

Nearly all of the batteries are published in separate sections,
and thus allow for the administration of short batteries. These
usually omit some of the special subjects. Illustration 57 gives a rough
comparison of topics included in achievement and reading batteries,
and shows that the Traxler Reading Tests, which are typical of many,
include a rate-of-reading score, which is not found in the
Metropolitan Achievement Tests or the Iowa Educational Development
Test. The latter includes a section on use of sources: abstracts,
periodicals, indices, library cards, etc. The Metropolitan
Achievement Tests include separate scores for computation, history, and
geography. The Psychological Corporation Clerical Examination
includes word comparison and filing tests, which are not found in the
others. More detailed descriptions are given below.

As yet only a few fragmentary studies of the relative merits of 
various tests are available. One should consult yearly reviews of 
tests, such as are found in Buros’ Mental Measurement Yearbook
and the periodic Reviews of Educational Research, by the American
Educational Research Association, for more detailed criticism and
information.


MEASUREMENT OF LANGUAGE 

Language, broadly defined, is any series of oral or motor acts by
which individuals communicate with one another. Language may be
divided into two general classes: unlearned signals for action and
symbols which refer to some experience. Animals have signal
languages made up of movements and sounds which cause particular
responses in other animals. These are often calls which lead to
protection, flight, mating, the sharing of food, or other experiences. Such
sounds seem to be developed through maturation along with the
acts and without any intention on the part of the animal to give
signals. For instance, animal psychologists usually agree that in a
fight a dog barks or growls, not to scare an opponent, but because the
situation sets off a series of movements which result in barking.
Similarly, a mother hen does not cluck because she intends to tell the
chicks that she has found food, but because the situation brings out
a clucking response. Such explanations of communicative behavior
among animals have been well supported by careful observations.
The sounds made by small infants seem to be of this same
unpremeditated variety, and among adults gestures and modes of
expression often appear to be unlearned or only slightly modified by
learning. Such elements of communication are important but so difficult
to appraise that they are seldom recorded in test results.

A symbol is defined as an act or object which becomes a substitute
for another. A symbol may or may not be intentionally given. Any
symbol derives its meaning from the responses which it evokes. Sounds
and words develop their common meanings by social agreement.
Thus, the word hot is a symbol agreed upon by a particular group of
people to refer to a common pattern of experience. Authorities
usually agree that written language originated partly from pictures and
partly from sound symbols. Certain spoken words probably began as
imitations of characteristic sounds and as unintentional signals.

Psychologically, the most significant aspect of language is probably
not the form, although that is important, but the processes of
abstraction and combination. The isolation of a particular aspect of a
situation is called an abstraction. When an abstraction is experienced
in several situations and remembered, it becomes a concept. The
simplest concepts are experiences of contrast or similarity of size,
shape, length, and loudness. Concepts may also be very intricate
patterns which depend upon several senses and which combine other
concepts. Illustrations of complex concepts are the rules of tennis
and the meanings of the words federalism or rnntmation. Not all
concepts are expressed or remembered verbally, but all words refer to
concepts.

A complete list of important language factors would include the
following somewhat independent skills and knowledges:

1. Techniques of expression: speech, grammar, punctuation,
spelling, rhetoric, and handwriting

2. Word knowledge in various fields

3. Reading: complex skills of perception, comprehension, and
reasoning

4. Muscular coordination, as in speech and writing

5. Visual acuity, as in reading and writing

6. Auditory acuity, as in speaking and listening

7. Knowledge of authors, publications, literary style, and form

8. Attitudes toward language usage and literature

Since the first three items cover the principal skills of language
achievement, they are described here in detail. Items 4, 5, and 6 are
dealt with in Chapter X. They involve elaborate techniques for 
diagnosis and remedial treatment. The last two factors, which con- 
cern literary discrimination and appreciation, are discussed in Chap- 
ters X and XXI. 

Appraisals of Expression 

Expression, either oral or written, although one of the most im- 
portant and usual aspects of a person's behavior, is difficult to evalu- 
ate. Whereas aspects of expression, such as handwriting or pronuncia- 
tion, can be mechanically recorded, their evaluation is always in the 
mind of the reader or listener. Moreover, some of the poorest expres- 
sions from a logical or a grammatical point of view have been popu- 
lar and effective politically, socially, and even artistically. 

Handwriting. Three kinds of evaluations of handwriting are
fairly common. (1) Experts try to determine for legal evidence
whether or not two or more samples were written by the same
person. The way a person's handwriting may vary under different
conditions is investigated. (2) Graphologists try to deduce indications of
personality traits from samples of handwriting (Chapter XVII). (3)
The excellence of handwriting is rated according to scaled samples
of penmanship.

Scales of penmanship were among the earliest to be developed. 
That of Thorndike (1910), which seems to have been the first, con- 
sisted of samples reproduced with their scale values assigned by 
the equal-appearing-interval technique. Ayres (1912) used a similar
method to devise a scale which has been widely used because he pub- 
lished grade norms for both quality and rate. Illustration 58 shows 
samples from his scale for three scale values, 20, 50, and 80. The aver- 
ages for quality were found to increase from 38 for the second grade 
to 62 for the eighth grade, or 4 points for each grade. The rate in 
average words written per minute increased from 31 for the second 
grade to 79 for the eighth grade for the standard selection. Koos (1918)
found the average quality of handwriting of adults in twenty-five 
different occupations to be 49.5 on the Ayres Scale, which is about the 
fifth-grade level. 
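The arithmetic behind these norms can be checked with a short sketch. It assumes, as the figure of 4 points for each grade implies, that average quality rises in a straight line between the second-grade and eighth-grade norms; the function names are illustrative only and are not part of the Ayres materials.

```python
# Ayres quality norms quoted in the text: 38 at grade 2 and 62 at grade 8.
GRADE_LOW, QUALITY_LOW = 2, 38.0
GRADE_HIGH, QUALITY_HIGH = 8, 62.0


def points_per_grade():
    """Average gain in quality per grade, assuming a straight-line norm."""
    return (QUALITY_HIGH - QUALITY_LOW) / (GRADE_HIGH - GRADE_LOW)


def grade_equivalent(quality):
    """Grade at which the assumed straight-line norm equals `quality`."""
    return GRADE_LOW + (quality - QUALITY_LOW) / points_per_grade()


print(points_per_grade())        # 4.0 points for each grade
print(grade_equivalent(49.5))    # 4.875, i.e. about the fifth-grade level
```

On this assumption the adult average of 49.5 reported by Koos falls just below the fifth-grade norm, which agrees with the statement in the text.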

Several check lists for a systematic recording of aspects of pen- 
manship have appeared. Freeman (1914) devised one which not only 
lists defects but also notes their most common causes. Nystrom (1930)
designed scales for color or heaviness, size, slant, letter spacing, word
spacing, beginning and ending of strokes, and alignment. Remedial
suggestions and diagrams to aid students accompanied each of the 
seven scales. 


ILLUS. 58. AYRES HANDWRITING SCALE
About One-Third Actual Size
(Ayres, 1912. By permission of the Russell Sage Foundation.)

Scales of English Composition. In order to provide a more reliable
means of grading English composition than the judgment of a
single teacher, Hillegas (1912) constructed a scale for use in the fourth
through the twelfth grade. It consisted of a series of short essays
arranged in order of excellence by a group of judges and numbered to
show standard deviation scale values and grade norms. About a
dozen similar scales have appeared, among the most analytical of
which is undoubtedly that of Van Wagenen (1923), who provided
different scales for the following:

Exposition: Sixteen short essays on the topic "How I Earned Some
Money"

Narration: Fifteen essays on "When Mother Was Away"

Description: Sixteen essays on "It Was a Sight Worth Seeing When the
Troops Marched Away"

Each essay is rated for three qualities, thought content, structure, 
and mechanics, and the average rating is taken as an indication of 
general merit. In order to aid the rater, his attention is called (p. 2)
to particular aspects of a composition, thus:

In rating for thought content in description, take into consideration:

Maintenance of point of view (both physical and mental)
Vividness of picture
Emotional reaction
Vigor and originality of diction

In rating for sentence and paragraph structure, take into consideration:

Unity
Coherence
Emphasis
Variety and complexity of sentences

The writer who uses many complex sentences shows a greater maturity
of mind than the one who uses very simple or unnecessarily compound
sentences, even though, from the very fact of the greater complexity, he may
make more actual mistakes in structure.

In rating for mechanical errors, take into consideration:

Spelling
Punctuation (only cases of actual error, not cases where punctuation is
optional)
Capital letters
Paragraphing (only cases of actual error, not matters of preference)

Here, too, one must take into consideration the range of vocabulary and
complexity of expression. For instance, it is a more fundamental error to
misspell “receive” than to misspell “psychological.” Punctuation of very
simple sentences would give less opportunity for error than that of
complicated sentences or conversation.

Samples from the Description Scale, which runs from 0 to 100, are
shown in Illus. 59. The scale values used by Van Wagenen were
established by submitting the compositions to 119 experienced teachers
who arranged them in order of excellence three times, once for
thought content, once for structure, and once for mechanics. A scale
value of 10 was arbitrarily assigned to a difference between two items
which caused 75 per cent of judges to rate one item better than the 
other. The whole scale extends over ten such differences. These 
samples have been chosen to show nearly the same scale values for 
thought content, structure, and mechanics, but in school situations 
a composition is often found to be high in one aspect and low in the 
others. This diagnosis allows a teacher to aid a pupil in the most 
effective manner. 
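The unit of such a scale can be illustrated by a brief sketch. The text states only that a difference judged in the same direction by 75 per cent of the judges was called 10 points; the sketch further assumes that other percentages are converted through the normal curve, which is a common device in scales of this kind but is not confirmed here as Van Wagenen's exact procedure. The function name is illustrative.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()   # mean 0, standard deviation 1
UNIT_PROPORTION = 0.75       # from the text: 75 per cent of judges agree
UNIT_VALUE = 10.0            # from the text: such a difference is 10 points


def scale_difference(proportion_preferring):
    """Scale distance between two essays, given the proportion of judges
    who rated the first better than the second.

    Assumption: proportions are converted to normal deviates, and the
    deviate for 75 per cent is defined as 10 scale points.
    """
    if not 0.0 < proportion_preferring < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    z = _STD_NORMAL.inv_cdf(proportion_preferring)
    z_unit = _STD_NORMAL.inv_cdf(UNIT_PROPORTION)
    return UNIT_VALUE * z / z_unit


for p in (0.50, 0.75, 0.90):
    print(p, round(scale_difference(p), 1))
```

Under this assumption an even split among the judges gives a difference of zero, and a scale of 100 points corresponds to ten steps of the 75-per-cent size, as the text describes.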

Since letter writing constitutes the major part of the written ex- 
pression of perhaps nine tenths of adults, the scales for grading the 
general excellence of correspondence prepared by Lewis (1923) are 
of considerable interest. Five separate scales of nine items each have 
been assembled for (a) letters ordering material, (b) applications for
a position, (c) narrative social letters, (d) expository social letters, and 
(e) simple narrations. Lewis suggests that pupils be allowed to com- 
pare their own compositions with those of the scale, so that they 
may note the good points and correct their errors. Another special 
ILLUS. 59. ENGLISH COMPOSITION SCALES: DESCRIPTION
SCALE ITEMS

General Merit 7.3 

Thought Content 8 Structure 9 Mechanics 6 

It was a Sight worth seeing when the Troops marched past. 

Went thet marched past is was fun to watch then And a puch of solirden narched 
past 

And a puch of trump with flag and peoUe with there flag. 

And then cant a puch of boy and girls with ernes of flag. 

And there were twelte elenfeet. 


General Merit 49 

Thought Content 50 Structure 49 Mechanics 49 

It was a Sight Worth Seeing When the Troops Marched by. 

It would send a thrill right through your body to see the troops march by With
the drums beating and the band playing it would make any body wish to join in
with the kaki clothed men ALL the soldiers looked like bunch of boys going to a 
Sunday school picnic instead of going to the gloomy trenches. It is wonderful to 
see the soldiers keep time. The soldiers look as if they can’t wait until they get 
over there. 


General Merit 76 

Thought Content 76 Structure 79 Mechanics 73 

It Was a Sight Worth Seeing When The Boys Marched By. 

The boys were going It was hard to believe. It was hard to realize ninety
two boys from our own high school were marching before us for the last time before
they went to France. Line after line, and rank after rank passed us as we stood
looking in amazement.

The day was wonderful There was not a cloud to be seen in the light blue sky. 
There was a soft breeze from the south. The prime of our Indian summer was 
here and, the boys were marching past 

Mothers, sweethearts, sisters, fathers, and brothers stood watching them as they 
went by. It was a sorrowful day for many of the onlookers, because probly some 
of them said good-bye for the last time and never to be welcomed home again. 
Still in the heart of each mother there was a little pride which lifted their heads a
trifle higher.

An aged woman stood next to me and she said to me, “My but I hate to see Jimmy
go He’s the only one I have left, but — but I’m sort of glad he is going, because
its — its a wonderful thing hes going to fight for ” Amidst her sobs she could say
no more 

Many others stood crying A flag waved from every rank, from everyery win- 
dow, and from nearly every little child 

As dusk came on everything became quiite The boys boarded the train, the 
people went home, everything was quite, the boys were gone.


(Arranged from Van Wagenen, 1923. By permission of the Educational Test
Bureau, Minneapolis, Minn., and the World Book Co.)
scale of considerable interest is that of Stewart (1934), who published
graded samples of news stories from high school journalism classes. 

Netzer (1938) began the standardization of an oral-composition 
scale for the fourth, fifth and sixth grades, after recording a large 
number of children’s responses to pictures, incomplete stories, and 
objects. The children responded “best” to the objects, “next best” to
the stories, and “least” to the pictures. Much more work along this 
line is needed. 

The task of grading answers to essay-type examination questions is 
similar to that of appraising English compositions. Since this type of
examination is probably the most common of all types, methods of 
grading it have been widely studied. A good many reports have ap- 
peared showing that one examiner usually differs considerably in 
assigning grades to essay-examination answers on two different oc- 
casions. Common self-reliability correlations, given by Stalnaker 
(1936), range from .40 to .60 in high school and college classes. Cor- 
relations between separate examiners have often been found to be 
less than these figures, owing to different standards of grading for 
facts, inferences, and grammar. Single readers of the Regents’ and of 
the College Entrance Board Examinations have, however, usually 
shown self-reliabilities of .90 or more, and similar correlations be- 
tween different readers are the rule. Stalnaker (1936) and Wright- 
stone (1938) have given detailed instructions for consistent grading. 
These include:

1. The question should require only one very definite and restricted 
answer, such as a statement of fact, or of an attitude, or an interpretation. 

2. If several purposes or skills are to be graded, these should be graded 
one at a time.

3. The ideal answer should be carefully formulated and credits for partial 
answers agreed upon by all the judges. 
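The self-reliability figures quoted above are ordinary product-moment correlations between the grades one examiner assigns to the same papers on two occasions. A minimal sketch follows; the grades in it are invented for illustration and come from no study cited here.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 scores")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores on each occasion must vary")
    return sxy / sqrt(sxx * syy)


# Hypothetical grades given by one examiner to the same eight papers
# on two different occasions.
first_reading = [85, 70, 92, 60, 78, 88, 65, 74]
second_reading = [80, 75, 85, 70, 72, 90, 60, 82]

print(round(pearson_r(first_reading, second_reading), 2))
```

The same function applied to the grades of two different examiners gives the correlation between readers.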

Traxler and Anderson (1935) found that, when grading was care- 
fully done, reliability of English essays of high school students was 
high, but that the retest-reliability of the pupils was relatively low 
over a short period of time. Much research is needed to show the usual
relationships among such language skills as vocabulary, grammar, 
reasoning, and style. 

Appraisals of Comprehension 

The task of measuring comprehension is usually much easier and 
less controversial than that of measuring expression, because a com- 
prehension test can use a multiple-choice technique and because ex- 
perts can usually agree upon the scoring. Literally hundreds of tests 
of English comprehension have been developed which can be scored
by a clerk without knowledge of the subject, or even by a machine. 
Those tests which are widely used will be described under the head- 
ings Measurement of Word Knowledge, Written Symbols, and Meas- 
ures of Reading Ability. 

Measurement of Word Knowledge. Knowledge of the meanings
of single words is basic to all language skills, hence tests of vocabulary
have become an essential part of all achievement tests. The only sure
way to measure a person’s vocabulary is to ask him to define all pos- 
sible words. Since the unabridged dictionaries include more than 
half a million English words, this would take a long time. It has been 
found, however, that a fair estimate of a person’s vocabulary can be 
secured in a short time by the use of well selected tests. 

General word counts. In order to select important samples of 
words, a number of counts have been made of the words used most 
commonly in various communications. Thorndike (1927) and his staff
produced an alphabetical list of 10,000 most commonly used words. 
These were found by the tabulation of about 4,000,000 words from 
newspapers, magazines, classics, novels, correspondence, and text- 
books on common subjects. Each word was given an index number 
indicating its relative frequency. The Thorndike Century Junior 
Dictionary (1935), which lists and defines the 23,000 most common 
words, is an extension of this work to include a much larger sample 
of publications. A similar list of 10,000 words was produced by 
Horn (1926) based on the analysis of personal and business corre- 
spondence of adults, most of whom had received more than average 
education. Buckingham and Dolch (1936) have compiled a list of 
19,000 words selected from eleven other lists and marked to 
indicate grade difficulty. Although the general agreement among 
various word counts is marked, different studies have yielded some- 
what different relative frequencies of words. One of the reasons for 
discrepancies is the variation in words listed. Thorndike listed verbs
and nouns from the same stem, as contain and container, but not the 
modifiers, contained and containing. He included as distinct units 
words which had the same root but somewhat different meanings and 
different frequencies, such as constituency, constituent, constitute, 
constitutional, constitution, and constitutionality. Persons who know
one of such a series may be able to infer a correct meaning for most 
of the others. No thorough study has come to hand revealing the 
relationship between knowledge of roots and knowledge of words 
containing the roots. Likewise, since there is no standard and widely 
accepted method of counting words, it is not possible to speak of the 
size of a person’s vocabulary in standard terms. The number of basic 
ideas needed for usual communication is much less than the number
of words in Thorndike's list. 
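The tabulation underlying a word count can be sketched in a few lines. The sketch counts each distinct spelling as a separate word and ranks the words by frequency; the splitting rule and the ranking are assumptions made for illustration and do not reproduce the index numbers of Thorndike or of any other list named above.

```python
from collections import Counter
import re


def word_frequencies(texts):
    """Tabulate how often each word form occurs in a collection of texts."""
    counts = Counter()
    for text in texts:
        # Assumed rule: a word is a run of letters, with an optional
        # apostrophe part, compared without regard to capital letters.
        counts.update(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))
    return counts


def frequency_ranks(counts):
    """Rank words from most to least frequent (1 = most frequent).
    Ties are broken alphabetically so that the result is repeatable."""
    ordered = sorted(counts, key=lambda word: (-counts[word], word))
    return {word: rank for rank, word in enumerate(ordered, start=1)}


sample = ["The cat saw the dog.", "The dog ran."]
counts = word_frequencies(sample)
ranks = frequency_ranks(counts)
print(counts["the"], ranks["the"])   # 3 1
print(counts["dog"], ranks["dog"])   # 2 2
```

The decision to treat contain and container as separate entries, or to merge them under one stem, is made in the splitting rule, which is one reason different counts yield different relative frequencies.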

Special word counts. In order to examine vocabularies in special
fields, word counts have been made of textbooks and other publications
in both physical and social sciences. An excellent summary of
master lists of terms used in the social sciences was reported by Kelley
and Krey (1934) and their associates. A list of 5,200 words is given,
which includes terms used in government, law, civics, political
science, economics, religion, and sociology.

Minimum vocabularies. Several interesting attempts to select 
basic vocabularies have been made in connection with the teaching
of foreign languages and the writing of dictionaries. West (1935) 
found that an excellent English dictionary could be written using 
only 1,923 words, and that 1,106 words were enough for an adequate 
speaking vocabulary. In a comparison with seven other minimum
vocabularies, West found that all eight lists included 2,219 different
words. Some of the words which seemed important to West did not 
appear with great frequency in Thorndike's list, but in general there 
was a marked relationship between frequency and usefulness of words. 

Minimum vocabularies have also been devised for special fields of 
information. Pressey's (1934) description of the selection of items es- 
sential for the understanding of history is typical of the best work. 
She first made frequency counts of words in six widely used history 
tests. These combined with the published results of others resulted in 
a master list of 1,444 words. This master list was then presented to 
sixty-nine teachers of history in secondary schools and colleges with 
instructions to mark each word essential, accessory, or unimportant.
The entire list was next rated by seven individuals especially trained
in social studies who indicated their judgment of the values of each
word outside of the history classroom. Finally, a list of 415 words was
selected all of which were frequently used and highly rated for both
historical and sociological usage. Illustration 60 contains the words
which were finally selected. 

In an earlier work Pressey (1924) prepared lists of basic concepts
in fourteen fields. The fields, which are not mutually exclusive, are:

1. Grammar and Composition: English, French, Latin, and German

2. Literature

3. Arithmetic

4. Algebra and Geometry

5. History

6. General Science

7. Biology

8. Chemistry

9. Physics

10. Physiology

11. Home Economics

12. Manual Training

13. Art

14. Music
The selection of essential terms at the high school and college level
for many different fields has now been completed by groups of
interested teachers.

Selection of words for tests. A fairly large number of tests have
been constructed to measure either general or special vocabularies,
using the following procedure. A random sample of about one hundred
words is selected from a word list and test items are constructed
about these words. The items are tried out on various age groups,
and the order of difficulty is determined by noting the percentage
of persons who succeed in each item. A final selection is made to
secure a wide range of items and enough items at each level of
difficulty to give a fairly precise and consistent discrimination between
all persons tested. One of the first widely used tests was made by
Terman (1916), who began by selecting a word on every tenth page
of an 18,000-word dictionary. After preliminary trials he selected two
lists of fifty words each, to measure general vocabulary for those with
mental ages of eight to nineteen years. Terman and Merrill (1937)
selected from these two lists forty-five words which seemed adequate
to appraise general vocabulary for those with mental ages of from
six to twenty-two years. This fact seems the more remarkable since
a minimum of thirty words is needed to distinguish between all of
these age levels. Usually only two words are needed to indicate one
year's growth in mental age. The reliability of such vocabulary tests
is generally high (.90 or more). Vocabulary tests are typically the most
consistent of all tests since they are not affected much by speed,
practice, or adjustment to the test situation.
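The step in which the order of difficulty is found from the percentage of persons succeeding on each item can be sketched as follows. The words and the pass-fail records are invented for illustration and are not taken from any test named in the text.

```python
def percent_passing(responses):
    """Percentage of persons who succeed on each item.

    `responses` maps each item to a list of True/False results,
    one result for each person tried.
    """
    return {
        item: 100.0 * sum(results) / len(results)
        for item, results in responses.items()
    }


def order_by_difficulty(responses):
    """Items arranged from easiest (most persons passing) to hardest."""
    passing = percent_passing(responses)
    return sorted(passing, key=lambda item: (-passing[item], item))


# Hypothetical tryout of three words on four persons.
tryout = {
    "orange": [True, True, True, True],
    "lecture": [True, True, False, False],
    "mosaic": [True, False, False, False],
}

print(percent_passing(tryout))
print(order_by_difficulty(tryout))   # ['orange', 'lecture', 'mosaic']
```

From such an ordering the final selection keeps enough items at each level to separate the persons tested.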

Thorndike (1926) used his word-frequency lists in devising a test
of 110 items divided into eleven levels of ten items each. The levels
were scaled to represent equivalent steps, and the items in each level
were chosen to be of practically the same difficulty.

Although the vocabulary tests of Terman and Thorndike were
intended for use in appraising general intelligence they are in effect
achievement tests, and tests similar to these have been included in
nearly every appraisal of educational achievement. More than forty
of these have been standardized on a national scale. As a rule the
general vocabulary tests have been found to correlate as high as .70
or more with total scores on achievement tests and intelligence tests.

If one examines the words included in many general vocabulary
tests, he usually finds few terms from scientific or artistic fields. At
the easier levels a large number of terms describing common objects
and personal relations are found, and at the more difficult adult
levels, literary, social, and business terms. This selection doubtless
reflects the frequency of word usage in common communications and
ILLUS. 60. WORDS SELECTED AS THE ESSENTIAL CORE OF
VOCABULARY IN HISTORY

A. Governmental Terms: ambassador, democracy, bill, appropriation,
authorities, empire, declaration, appointment, consul, federal, decree,
budget, governor, king, government, document, coinage, imperialism,
law, currency, minister, monarchy, legislation, customs, official,
republic, self-government, measure, debt, prime minister, petition,
duty, premier, tyranny, proclamation, expenditures, police, union,
proposal, greenback, president, provision, mint, representative, city,
report, protective, secretary, colony, resolution, revenue, senator,
country, restriction, tariff, sovereign, county, statute, tax, statesman,
dominion, treasury, vice-president, nation, abolish, province, abdicate,
doctrine, assembly, state, adjourn, issue, bureau, territory, annex,
policy, board, town, appoint, reservation, cabinet, commission,
alliance, authorize, compromise, centralization, committee,
arbitration, concede, civil, conference, diplomacy, conciliate, civil
service, congress, foreign, confiscate, domestic, council,
international, enact, internal, department, negotiation, enforce,
interstate, House of Representatives, neutrality, grant, local, pact,
impeach, municipal, league, peace, inaugurate, nullify, states rights,
legislature, powers, parliament, reciprocity, ratify, administration,
senate, treaty, repeal, regime, session, amendment, repudiate,
sanction, capitol, anarchy, article, veto, patriotism, commonwealth,
communism, charter, constitution, executive, prohibition, confederacy,
legislative, reconstruction, despotism, act, judiciary, referendum

B. Political Terms: campaign, candidate, anti-slavery, majority,
ballot, abolitionist, minority, election, caucus, democrat, unanimous,
polls, convention, federalist, primary, deadlock, political party,
conservative, suffrage, delegate, progressive, partisan, vote,
nominate, republican, radical, opponent, plank, socialist, whig,
lobbying, platform, politics, patronage, spoils system, ticket

C. Economic Terms: business, manufacture, commerce, merchandise,
commodity, production, company, property, competition, raw material,
consumer, rebate, exploit, shipping, export, trade, factory, goods,
corporation, import, monopoly, industry, trust, employee, inflation,
employer, investment, labor, market, strike, panic, union, bank,
speculation, stocks, bankrupt, communication, bond, public utilities,
capital, credit, transportation, crisis, prosperity, depreciation,
finance, wealth

D. Sociological Terms: aristocrat, peasant, slave, society, community,
homestead, pioneer, plantation, settlement, rural, urban, census,
inhabitants, population, negro, race, mob, riot, emigration, expansion,
immigration, migration, education, institution, invention, reform,
emancipation, freedom, independence, liberty, oppression, people,
private, public, public opinion, standard of living

E. Legal Terms: arbitrary, illegal, invalid, justice, legal, rights,
unconstitutional, alien, citizen, exile, nationality, native,
naturalization, appeal, case, convict, crime, decision, execution,
injunction, judge, jury, fraud, graft, testimony, verdict, conspiracy,
violation, insurrection, witness, rebellion, court, revolt, revolution,
jurisdiction, secession, supreme court, sedition, bribery, smuggling,
treason, corruption

F. Military Terms: allies, navy, belligerents, officer, enemy, hostile,
reinforcements, service, pirate, troops, submarine, army, commander,
marine, confederate, militia, cruiser, general, fleet, recruit, soldier,
veteran, forces, naval, volunteer, draft, invasion, mobilization,
massacre, military, aggression, munitions, attack, occupation, battle,
offensive, blockade, siege, bombardment, strategic, campaign,
surrender, contraband, victory, defensive, embargo, war, evacuation,
armistice, fortification, disarmament, indemnity, reparations

G. Geographical Terms: boundary, district, exploration, agriculture,
continent, region, navigation, irrigation, continental, section,
voyage, reclamation, coast, frontier, discover, conservation,
Pan-American, prairie, expedition, natural resources

H. Religious Terms: clergy, denomination, catholicism, papacy,
missionary, heresy, protestantism, pope, creed, intolerance,
persecution, crusade, tolerance

I. Terms Referring to Chronology and Records: ancient, history,
records, civilized, century, era, primitive, current, decade, event,
modern, propaganda, movement, medieval, period, publicity, precedent,
tradition

(Pressey, 1934, p. 186. By permission of Charles Scribner's Sons.)
also the occurrence of words in dictionaries. Many examiners have,
however, believed that tests of general vocabulary were more
favorable to those with academic interests than to those with mechanical,
agricultural, artistic or scientific interests. This belief has doubtless
been a force in the creation of a number of tests of special vocabulary.
Teachers in special fields have also desired such tests.
The procedure for constructing tests in special fields is similar to 
that used for designing general-vocabulary tests. A special master list 
is first obtained. Then the selection of words is made in such a way 
that an adequate sample of each level of frequency is thought to be 
included. The adequacy of the sample depends upon the use to be
made of the test. After a test has been applied, the adequacy of the
whole test, or of each separate item, in discriminating between in- 
dividuals may be ascertained by the methods outlined in Chapters 
III and IV. 

The Iowa Silent Reading Test, by Greene and Jorgensen (1943), 
includes four separate vocabulary tests: social science (20 items),
physical science (15 items), mathematics (15 items), and English 
literature and grammar (20 items). 

The Progressive Achievement Tests, Intermediate and Advanced, 
by Tiegs and Clark, include separate vocabularies for the same four 
fields, each of which consists of twenty-five items. Total vocabulary
scores are secured by adding the four subtest scores (Illus. 65A).

The Michigan Vocabulary Profile Tests, Greene (1949), include
eight divisions of terms:

1. Human relations: mental and social processes and situations

2. Commerce: business, manufacture, sales, economics

3. Government: legislative, executive, judicial

4. Physical sciences: physics, chemistry, mechanics

5. Biological sciences: zoology, anatomy, pathology

6. Mathematics: arithmetic, algebra, geometry, trigonometry

7. Fine arts: plastic, graphic, architecture

8. Sports: ten most common sports which adults play

Each division of the battery consists of thirty different items
arranged on ten levels of difficulty. The items range in difficulty from
those passed by at least 98 per cent of a group of college sophomores 
to those passed by 2 per cent or less of the same group. The gradation 
in difficulty of the items is such as to yield a test of high discrimina- 
tive capacity at the senior high school and college level. 

Each item consists of a definition and four words or phrases, only 
one of which is completely and accurately defined or described. The 
subject is asked to select the one which he thinks is correct. Items 
which correlated less than .30 with total scores in their division were
eliminated.
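The elimination of items by their correlation with division totals may be sketched as below. The sketch assumes that each item is scored 1 or 0 and that the total includes the item itself; whether the original analysis excluded the item from the total is not stated in the text. The scores are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def retained_items(score_matrix, threshold=0.30):
    """Indexes of items whose correlation with the total score is at
    least `threshold`. Each row of `score_matrix` holds one person's
    1/0 scores on the items of a division."""
    totals = [sum(row) for row in score_matrix]
    kept = []
    for j in range(len(score_matrix[0])):
        item = [row[j] for row in score_matrix]
        if len(set(item)) < 2:
            continue  # everyone passed or everyone failed: no correlation
        if pearson_r(item, totals) >= threshold:
            kept.append(j)
    return kept


# Hypothetical scores of five persons on three items.
scores = [
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 0, 1],
    [0, 0, 1],
]
print(retained_items(scores))   # [0, 1]; the third item correlates about .10
```

With so few persons the figures are of course unstable; the sketch shows only the form of the computation.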

An attempt was made to eliminate all items in which the right
answer could be found solely by reasoning from knowledge of roots
and prefixes or by eliminating wrong answers. This attempt was
not entirely successful, but the number of items of this sort has been
reduced by using the same prefixes and roots more than once in an
item and by selecting wrong answers which were nearly, but not quite,
synonymous with the right answer. It was desired to make an
information test which would be affected as little as possible by reasoning.

Reliabilities as indicated by correlations between two equivalent
forms of 30-item tests given one week apart are shown in italics in
Illus. 61. These range from .78 to .94, with a median of .80. These
figures show that longer tests are not necessary for fairly accurate
individual predictions. If more accurate predictions are needed for
a specific purpose, both forms might be used.
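The gain to be expected from using both forms can be estimated with the Spearman-Brown formula, which the text does not name but which is the usual means of predicting the reliability of a lengthened test. The sketch assumes that the two forms are strictly equivalent, so that using both amounts to doubling the length of the test.

```python
def spearman_brown(reliability, length_factor=2.0):
    """Predicted reliability when a test is lengthened `length_factor`
    times with equivalent material (Spearman-Brown formula)."""
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )


# The lowest and highest single-form reliabilities quoted in the text.
for r in (0.78, 0.94):
    print(r, round(spearman_brown(r), 2))
```

On this assumption the least reliable division, at .78 for one form, would rise to about .88 when both forms are given, and the most reliable, at .94, to about .97.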

Illustration 61 also shows the intercorrelations among the various
divisions of the Vocabulary Profile Test for a group of liberal arts
college sophomores. Results for other grade groups were substantially
the same. These correlations are all below .55, with a median
correlation of .27. The divisions thus show a large degree of
independence and would appear even more independent in a large unselected
population. Practically zero correlations are found between scores
in fine arts and scores in commerce, government, and physical
sciences. Scores in physical sciences correlated approximately .50 with
scores in biological sciences and mathematics. These figures indicate
the presence of a number of fairly well-isolated factors.
Psychologically there is little evidence for any functional relationship between
the information in any two of these divisions, with the exception of
mathematics, which is needed as a tool subject in many fields of
human thinking.

Illustration 62 presents graphically the score of Henry Brown on
the Michigan Vocabulary Profile Test. It appears that he was above
the second-year college average in all except two fields, fine arts and
sports. His highest scores were in physical sciences, where he
exceeded the scores made by the lowest 98 per cent of the group. He was
also in the high 10 per cent of the group in biological sciences,
mathematics, and total test scores. Such profiles are valuable for appraising
the technical information which a person has, and also predict fairly
well his reading and composition skills in various fields.

Total Michigan Vocabulary Scores correlated .56 with the vocabulary
section of the Cooperative English Test and .61 with the vocabulary
section of the American Council on Education Psychological
Examination for College Freshmen. Since all these tests have high
self-correlations, the conclusion must be reached that they are measuring
different and somewhat unrelated fields of information.
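The reasoning here may be made concrete by the customary correction for attenuation, which estimates what the correlation between two tests would be if both were perfectly reliable. The reliabilities of .90 used in the sketch are assumed for illustration only; the text says merely that the self-correlations are high.

```python
from math import sqrt


def corrected_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated correlation between two tests after allowing for the
    unreliability of each (correction for attenuation)."""
    return r_xy / sqrt(r_xx * r_yy)


observed = 0.56               # correlation reported in the text
assumed_reliability = 0.90    # hypothetical value for each test

print(round(
    corrected_for_attenuation(observed, assumed_reliability, assumed_reliability),
    2,
))   # 0.62
```

Even after such a correction the correlation remains well below 1.00, so that unreliability alone cannot account for the low agreement between the tests.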

Written Symbols. In this division many well-prepared tests of
spelling, grammar, punctuation, and sentence structure are to be
found. Nearly all of these are composed of items which contain errors.
The student is asked to detect the errors and in some instances to
correct them.


ILLUS. 62. MICHIGAN VOCABULARY PROFILE SCORES OF
HENRY BROWN

[Profile chart. Divisions: 1. Human Relations; 2. Commerce, Business;
3. Government, Legal; 4. Physical Sciences; 5. Biological Sciences;
6. Mathematics; 7. Fine Arts; 8. Sports; Total. Horizontal scales:
Standard Score, 30 to 80, and Percentiles.]

Norms for Second-Year College
(Greene, 1949)


English usage. The selection of items for tests of English usage
has usually followed an investigation to reveal prevalent errors. The 
pioneer work of Charters (1920) shifted the emphasis from the rough 
scoring of papers to systematic diagnosis and remedial work. His 
studies showed that the misuse of fourteen common verbs gave rise to 
nearly 60 per cent of all oral errors. These errors were made by only 45
per cent of the students, hence individual rather than group treat- 
ment was suggested. 
O’Rourke (1934), from a nation-wide study of English usage in
the seventh through the twelfth grades, found that good sentence
structure was the most difficult problem at all levels, that careless
omissions or repetitions came second, and that ambiguous meaning
came third. Errors in verb forms were uncommon, contrary to the
reports of errors in the lower grades. 

Illustration 63 presents parts of a typical test of language usage by
Barrett et al. (1938) which yields scores for the three parts:

I. Sentence Structure and Diction (30 items, 10 minutes)

II. Grammatical Forms (35 items, 20 minutes)

III. Punctuation (30 items, 10 minutes)

ILLUS. 63. BARRETT-RYAN-SCHRAMMEL ENGLISH TEST

TEST: FORM A

For Grades 9 to 12 and College

PART I. SENTENCE STRUCTURE AND DICTION

DIRECTIONS: In the following paragraphs some expressions are
underlined. (The expression may be a word or a group of words.) If the
expression is rightly used and rightly placed, make a heavy mark like
this | in the space between the dots under R on the Answer Sheet. If
the expression is either wrongly used or wrongly placed, make a heavy
mark in the space under W on the Answer Sheet, as shown in the sample.
(See the sample answer on the Answer Sheet.)

SAMPLE: Even though you don't succeed at first, you had ought to try
again.

In the senior class were [illegible] of us boys who ranked high in
scholarship and who wanted to go to college. Five of us could not of
gone even for one year without we worked not only for our living
expenses but also for our tuition and books. Wishing to get work, it
was our plan to write to several colleges and asking what our chances
were for employment. Having received encouraging letters from one of
the colleges, three of us decided to attend that college. The ...

PART II. GRAMMATICAL FORMS

DIRECTIONS: In each numbered portion of the story below there is an
underlined word. Some of these underlined words are right and some are
wrong. If the word is right, make a mark under R on the Answer Sheet.
If the word is wrong, as shown in the sample below, make a mark under
W. (See the mark under W on the Answer Sheet.) Then look at the three
items numbered 1, 2, and 3, one of which names the correct form to be
used. Only one of these items is the right explanation. Choose the
right one, and make a mark on the Answer Sheet under the number of
that item.

SAMPLE. Who done the work on the blackboard yesterday?
   1 past participle
   2 past tense ✓
   3 present tense

1. Before school closed in June [illegible] girls were making plans to
   spend part of our vacation camp-
   1 possessive case, to modify girls
   2 nominative case, subject of were making
   3 objective case, in apposition with girls

2. ing out. It was left to Jane and [illegible] to get a chaperon, for
   we must have one. After much
   1 nominative case, subject of get
   2 objective case, object of was left
   3 objective case, object of to

3. deliberation as to whom of our teachers would like to spend two
   weeks in camp, Jane suggested
   1 objective case, to agree with teachers
   2 nominative case, subject of would like
   3 objective case, object of to

(Prepared by E. R. Barrett, Teresa M. Ryan, and H. E. Schrammel,
Kansas State Teachers College, Emporia, Kansas. Copyright, 1938. By
permission of the World Book Co.)

Scoring is simplified by having the student place all his marks on an
answer sheet, which may be checked quickly. In this test some
expressions in the printed text are underlined. The task is to
determine whether or not the underlined expressions are rightly used.
In Part II the student is asked to describe the correct form when a
wrong grammatical form has been used. The test allows fairly liberal
periods of work. Centile ranks are furnished which show means of 8S,
96, 103, 10(), and 109 for the ninth, tenth, eleventh, and twelfth
grades and college freshmen respectively. Overlapping between the
ninth grade and college freshmen, which is very marked, is typical of
reports of this sort.

Illustration 64 is of a proofreading type of test by Wilson (1923)


ILLUS. 64. WILSON LANGUAGE ERROR TEST

Directions: This is a test to see if you can correct the mistakes
that a pupil has made in writing a short story. (A short sample is
then given and corrected.) That is the way you are to do in this test.
There are three stories in this folder, but you are to take only the
first one unless your teacher tells you differently. You are to
correct all the mistakes in that story just as it has been done in the
sample. Draw a line through each wrong word and write the correct word
above it. Be very careful to correct every mistake. Work at your usual
rate. You will be given time enough to finish unless you are very
slow. Be a good sport, do your best and play fair.

STORY A

Saturday Morning

[The story is reproduced with a pupil's handwritten corrections
written over the print; words that cannot be read are marked
[illegible].]

Saturday morning is a busy time [illegible] has a good chance to work.
Me and Dorothy divide the tasks between us. Then we race to see who
will finish first. Last Saturday I [illegible] the breakfast dishes as
one of my tasks. I am [illegible] fond of washing dishes. You should
have saw me work. I wanted to get through so as I could play.

John [illegible] called up at eleven o'clock to see if I might play
with him. I had [illegible] rooms to dust before I could go. John
[illegible] that I couldn't hardly leave my work until I had
[illegible] all of it. He brought over some doughnuts and gave them to
me. I sure appreciated the doughnuts. Then John helped me. It was real
good of him. When we had finished, I suggested playing marbles until
time for dinner. "I [illegible] got no marbles," said John.
"[illegible] handy," I replied. Then I [illegible] him some of mine. I
had [illegible] many for [illegible] bag. John and I enjoy marbles.

When dinner was ready, mother invited John to stay. "If I was sure my
mother wouldn't care, I should like to stay," he replied. John
[illegible] that he was really wanted so he telephoned to his mother.
He enjoyed the dinner and [illegible] heartily. When them apples was
passed, John wanted one, but he couldn't eat [illegible] more. After
dinner we had another game of marbles. I hopes John may come over
again.

(Wilson, 1923. By permission of the World Book Co.)

in which twenty-eight grammatical errors are to be corrected. Median
grade scores rise from 7 for grade three, to 22 at grade eight, and to
26 at grade twelve.

The question is often asked: do tests of the proofreading variety
measure one's skill as well as scores taken from an actual composition?
Willing (1926) made a comparison between a proofreading test
containing 180 errors and an original composition. The composition
revealed 12 errors per pupil, and the test, 59 errors. Although there
were certain errors in the composition not represented in the test, it
seems clear that the test was a more comprehensive instrument.
Both kinds of appraisals should be used for remedial work, and
considerable research is needed to find usual relationships between
them.

A few diagnostic tests have been prepared from which a detailed
check of the specific types of errors made by a student can be secured.
Tiegs and Clark (1934) published the Progressive Achievement Tests
Series, which contains the analysis of language errors shown in Illus.
65B (lower part of the illustration). Seven situations in which errors
in Capitalization may occur are listed for remedial action, five in
Punctuation, five in Words and Sentences, and eleven in Grammar.
Although there are not enough items in each one of these subdivisions
to yield reliable indications of special types of errors, the technique
of diagnosis is probably sufficient for a great many situations. In
Illus. 65A total scores for Capitalization, Punctuation, Words and
Sentences, Grammar, Spelling, and Handwriting and a total language
score are given in profile form for this same test. It appears that
James Brown was considerably above average in Punctuation and
Grammar, near the average in Capitalization and Words and
Sentences, and far below average in Spelling and Handwriting.

Measures of spelling. Investigations of spelling difficulties, which
are fairly numerous, are typified by the reports of Gates (1922) and
Wheat (1932). Both studies agree that carelessness is a major source
of error and that great improvements come from well-motivated
drills. The six most common types of errors are:

Phonetic: Wensday for Wednesday, kite for height
Use of the vowels: acheive for achieve
Double letters: leter for letter
Omissions of silent letters and central syllables
Substitution: goiny for going
Mispronunciations: chimley for chimney

Probably all except the last of these types of errors are due to faulty
visual memory.

For testing purposes the standard scales are nearly all of the survey
variety in which items of increasing difficulty are presented. A typical
example is the Ayres (1915) scale, which was made from a list of one
thousand common words. After the words had been presented for
spelling to large grade groups, the results were tabulated to show
the per cent of each grade which succeeded in spelling each word.
Words of approximately equal difficulty were grouped together as


ILLUS. 65A. PROGRESSIVE ACHIEVEMENT TESTS, ADVANCED
BATTERY, FORM A, HIGH SCHOOL AND COLLEGE

(Diagnostic Tests Keyed to the Curriculum)

Devised by Ernest W. Tiegs, Dean, University College, University of
Southern California, and Willis W. Clark, Director of Administrative
Research, Los Angeles County Schools.

Name ....  School ....  Birthday ....  Teacher ....  Date ....  Sex ....

DIAGNOSTIC PROFILE

[The form has columns for Possible Score, Pupil's Score, and
Percentile Rank for Grade. The handwritten entries for the pupil are
not legible in this reproduction; only the printed possible scores are
given.]

TEST SUBJECT                        Possible Score

1. Reading Vocabulary                    100
   A. Mathematics                         25
   B. Science                             25
   C. Social Science                      25
   D. Literature                          25
2. Reading Comprehension                  55
   E. Following Directions                10
   F. Organization                        15
   G. Interpretations                     30
3. Mathematical Reasoning                 60
   A. Number Concept                      20
   B. Symbols and Rules                   15
   C. Numbers and Equations               10
   D. Problems                            15
4. Math. Fundamentals                     80
   E. Addition                            20
   F. Subtraction                         20
   G. Multiplication                      20
   H. Division                            20
5. Language                              125
   A. Capitalization                      15
   B. Punctuation                         10
   C. Words and Sentences                 25
   D. Grammar                             30
   E. Spelling                            30
   F. Handwriting                         15
TOTAL                                    420

ILLUS. 65B. DIAGNOSTIC ANALYSIS OF LEARNING DIFFICULTIES

If the diagnostic profile on the first page of this test indicates that
the pupil is making normal progress in all fields, the teacher will
have no use for the following diagnostic analysis. However, where the
diagnostic profile shows achievement below a desirable standard in one
or more major fields, the following device will assist in isolating
and analyzing the specific causes of difficulty as a basis for remedial
instruction.

The numerals and capital letters in the diagnostic analysis correspond
to the sections of the test similarly marked. For example, if the
diagnostic profile shows unsatisfactory achievement in Test 4, Sec. E
(addition in arithmetic fundamentals), an inspection of the
unsatisfactory responses in this section of the test (by number) will
reveal whether or not remedial instruction is needed in carrying, use
of zeros, reducing to common denominators, and the like. These topics
are then checked by the teacher as the basis for remedial work.

Once an adequate diagnosis has been made, remedial instruction is
frequently a simple matter. However, teachers have in the past found
the clerical work incident to following each individual pupil a heavy
burden. Such extra work is almost completely eliminated if this
diagnostic analysis is torn from the test booklet and kept on the
teacher's desk, where the various items may be checked off as the
pupil masters them.

[Form reproduced in outline. Only the topic names that remain legible
are listed; the item numbers printed beside each topic and the
teacher's check marks are not recoverable.]

READING

1. Reading Vocabulary
   A. MATHEMATICS: basic vocabulary
   B. SCIENCE: basic vocabulary
   C. SOCIAL SCIENCE: vocabulary
   D. LITERATURE: vocabulary

2. Reading Comprehension
   E. FOLLOWING SPECIFIC DIRECTIONS: mathematical situations;
      definitions and following directions
   F. ORGANIZATION: vocabulary; use of index; references; outline
   G. INTERPRETATION OF MEANINGS: selecting topic or central idea;
      understanding directly stated facts; making inferences

MATHEMATICS

3. Mathematical Reasoning
   A. NUMBER CONCEPT: writing integers; writing money; writing
      fractions; Roman numbers; fractions and decimals; exponents and
      roots; negative numbers
   B. SYMBOLS AND RULES: symbols; vocabulary; negative numbers;
      solving equations
   C. NUMBERS AND EQUATIONS
   D. PROBLEMS: sharing and averaging; insurance and discount; ratio
      and percentage; budgeting

4. Mathematical Fundamentals
   E. ADDITION: combinations; carrying; money; denominate numbers;
      fractions and decimals; adding percentages; adding abstract
      numbers
   F. SUBTRACTION: reducing fractions to common denominators;
      borrowing with mixed numbers; subtracting fractions and decimals
   G. MULTIPLICATION: tables; zeros in multiplicand; zeros in
      multiplier; two-place multipliers; cancellation of fractions;
      fractions and decimals; pointing off decimals; multiplying
      abstract numbers
   H. DIVISION: tables; zeros in quotient; remainders; inverting
      divisor in fractions; reducing fractions to decimals; pointing
      off decimals; dividing abstract numbers

LANGUAGE

5. Language
   A. CAPITALIZATION: first word of sentence; names of persons; names
      of places; days of week and months; first word of quotation
   B. PUNCTUATION: quotation within quotation; over punctuation
   C. WORDS AND SENTENCES: singulars and plurals; tense; good usage;
      recognizing sentences
   D. GRAMMAR: vocabulary; parts of sentences; kinds of sentences;
      parts of speech (verbs, adjectives, conjunctions)
   E. SPELLING
   F. HANDWRITING: quality and legibility

(This and the material reproduced on the preceding page, copyright,
1934, by E. W. Tiegs and W. W. Clark. By permission of the Southern
California School Book Depository, Los Angeles, California, and the
authors.)


shown in Illus. 66, where levels G, O, and Y are presented along with
the per cents of words spelled correctly by each grade.

ILLUS. 66. BUCKINGHAM EXTENSION OF AYRES SPELLING SCALE

Per cent of each grade spelling the words of each level correctly

             Levels
Grade      G     O     Y
II        84    27     0
III       94    50     2
IV        99    73     8
V               84    16
VI              92    27
VII             96    42
VIII            99    58
IX                    73

Words (five columns, as printed)

by          eight        remain       lemon        decision
have        afraid       direct       laughter     principle
are         uncle        appear       lying        accommodate
had         rather       liberty      mountains    accuracy
over        comfort      enough       nails        counterfeit
must        elect        fact         needle       dessert
make        aboard       board        nobody       digestible
school      jail         September    oar          immense
street      shed         station      palace       leopard
say         retire       attend       penny        marmalade
come        refuse       between      pitcher      millionaire
hand        district     public       regular      mucilage
ring        restrain     friend       repeals      orchestra
live        royal        during       reprove      parliament
kill        objection    through      sailor       perceived
late        pleasure     police       sentence     possess
let         navy         until        shining      precipice
big         fourth       madam        surface      recommended
mother      population   truly        sweeping     resemblance
three       proper       whole        sweeps       restaurant
land        judge        address      thief        seized
cold        weather      request      waist        superintendent
hot         worth        raise        waiting      surgeon
hat         contain      August       weary        thoroughly
child       figure       Tuesday      writing
ice         sudden       struck
play        forty        getting
sea         instead      don't
bread       throw        Thursday
come        personal     canoe
eats        everything   captain
food        rate         cellar
            chief        clothes
            perfect      covered
            second       creature
            slide        curtain
            farther      declared
            duty         distance
            intend       double
            company      explain
            quite        fields
            none         floated
            knew         holiday

(Buckingham, 1927. By permission of the Public School Publishing Co.)
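
The tabulation behind a scale of this kind can be sketched as follows. This is only an illustration under stated assumptions: the tallies are hypothetical, and the fixed ten-point bands are a simplification chosen for the example, not the procedure Ayres or Buckingham actually used to form the levels.

```python
from collections import defaultdict


def percent_correct(results):
    """Map each word to the per cent of pupils who spelled it correctly.

    `results` maps a word to a list of True/False values, one per pupil.
    """
    return {word: 100.0 * sum(marks) / len(marks)
            for word, marks in results.items()}


def group_by_difficulty(percents, band_width=10):
    """Group words whose per cent correct falls in the same band.

    Bands are keyed by their lower bound (0, 10, ..., 90); a word spelled
    correctly by every pupil is placed in the top band.
    """
    top_band = int(100 // band_width) - 1
    groups = defaultdict(list)
    for word, pct in percents.items():
        band = min(int(pct // band_width), top_band)
        groups[band * band_width].append(word)
    return {low: sorted(words) for low, words in sorted(groups.items())}


if __name__ == "__main__":
    # Hypothetical tallies for ten pupils in one grade.
    tallies = {
        "by": [True] * 9 + [False],
        "have": [True] * 10,
        "eight": [True] * 5 + [False] * 5,
        "decision": [False] * 10,
    }
    print(group_by_difficulty(percent_correct(tallies)))
```

Repeating the tabulation for each grade would give one row of per cents per grade for every group of words, which is the form of the table in Illus. 66.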
Methods of examining one's spelling ability include both recall
and recognition. Dictation is one of the commonest methods and one
of the most searching, since one has greater difficulty in recalling than
in recognizing facts. The Stanford Achievement Test attempts to
measure usual spelling habits by not telling the students that the
dictation passage is a spelling test.

Another method of examination presents a printed series of words
and asks the student to correct those that are misspelled. The Iowa
Placement Examinations, Series E T. I. (1925), present fifty words,
of which twenty-five misspelled words are to be correctly written.
The first ten of these are:

.... acceptance        .... disagreeable
.... appreciate        .... experiance
.... begining          .... evidantly
.... confirming        .... niece
.... crocheting        .... genuine

A third method of examination is illustrated by the Columbia
Research Bureau English Test (1926). A word is printed four times,
once correctly spelled and three times incorrectly, thus:

1. fifty     2. fivety    3. fifety    4. fivty

1. wissdom   2. wisedom   3. wisdom    4. wisdome

1. variety   2. variity   3. viriety   4. variaty

The task is to select the correctly spelled form.

A fourth method of examination attempts to parallel dictation by
presenting words written in a phonetic style, and then asking the
examinee to give the approved English spelling, thus, espeshally for
especially and biznes for business.

Research is needed to evaluate the relative advantages of these
methods of examination. If enough items of a carefully graded sort
are given, the results of one type of examination usually correlate
highly with those of others.
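
How closely two methods of examination agree can be checked with a product-moment correlation. The sketch below is a minimal illustration; the scores are hypothetical and are not taken from any study cited here.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical scores of five pupils on a dictation test and on a
    # recognition (multiple-choice) spelling test.
    dictation = [12, 15, 18, 20, 24]
    recognition = [10, 14, 17, 21, 23]
    print(round(pearson_r(dictation, recognition), 2))
```

A coefficient near 1.00 would indicate that the two methods rank pupils in nearly the same order; with so few pupils, as in this example, the figure would of course be unstable.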

Measures of Reading Ability. Since the variety of reading activities
is large, many types of reading tests have been constructed.
Nearly all reading tests, however, involve four language factors
which seem to be somewhat independent. Two of these have just
been discussed: knowledge of vocabulary and of techniques of usage
and spelling. The other two factors are often called perceptual span
and inference. The most common tests of reading, called general
tests, demand all of these four factors in unknown amounts, but in
diagnostic testing it is possible to control three factors and then to
measure the fourth in fairly pure form. The analytical approach is
important when a person is deficient in only one or two aspects, since
it is desirable to provide a special remedy. Both general and
diagnostic tests will be described. (See Appendix II.)

Reading test situations are complicated by the fact that both speed
and accuracy, which are often incompatible, are frequently desirable
aspects of achievement. Since it is difficult to hold either accuracy or
speed constant, the evaluation of these two aspects is a persistent
problem.

General reading tests. Silent-reading tests have usually taken the
form of a series of sentences or paragraphs, each of which is followed
by one or more questions. As the test progresses the items become
longer and more involved, the vocabulary harder, and the questions
more complex.

Diagnostic tests. Diagnostic tests will be considered under three
headings: (a) tests of simple comprehension, (b) tests which emphasize
verbal reasoning and organization of ideas, and (c) tests which
emphasize perceptual speed and span. The first two of these are
shown in Illus. 65B (p. 178) with subheadings which yield a detailed
analysis of James Brown's record. It appears from the underlined
numbers, which indicate errors, that he did fairly well on simple
comprehension items, but very poorly on organization and
interpretation items. Illustrative items will be given from various
tests.

1) Simple Comprehension. Here the task is to answer questions
about facts stated in the context. Thus, the Sangren-Woody (1927)
Reading Test, Part III, consists of Fact Materials and Part VI, of
Following Directions. Portions of these are printed in Illus. 67.

2) Verbal Reasoning and Organization. Some of the tests in this
group ask one to draw inferences from a paragraph. These are
illustrated by the Gates Silent Reading Tests (1926) which yield scores
for (a) appreciating general significance, (b) predicting the outcome
of given events, (c) understanding precise directions, and (d) noting
details. The last two of these seem to emphasize comprehension of
stated facts, but the first two (Illus. 68) require inferences as well.
These cannot be considered to be pure reasoning tests because the
vocabulary becomes more difficult as the test progresses.

The purest form of a verbal-reasoning test is thought to be
syllogistic (Illus. 11). Although such forms are rarely found in
achievement-test batteries, they have the advantage of being relatively
free from vocabulary variations. A difficult syllogistic test can be
made in which only a third-grade vocabulary is used.

Tests of reorganization use material which has been disarranged
with instructions that it be properly arranged. Disarranged sentences
are shown in Illus. 69A and disarranged paragraphs in Illus. 69B.
Analysis of factors leading to success in these tests is very difficult
when the vocabulary load becomes greater as the reorganization
becomes harder. One person may fail the test because of a poor
vocabulary, another because of poor inferences.
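
The scoring of a disarranged-sentence item can be sketched briefly. The fragments are those of the first sample in Illus. 69A; the answer key (2, 3, 1) is read from that sample, and the all-or-nothing scoring rule and both function names are assumptions made for the illustration, not the publisher's scoring rules.

```python
def arrange(fragments, order):
    """Join numbered fragments in the given order (numbers start at 1)."""
    if sorted(order) != list(range(1, len(fragments) + 1)):
        raise ValueError("order must use each fragment number exactly once")
    return " ".join(fragments[i - 1] for i in order)


def score_item(response, key):
    """Return 1 if the pupil's order matches the key exactly, else 0."""
    return 1 if list(response) == list(key) else 0


if __name__ == "__main__":
    fragments = ["a wagon", "a boy", "had"]  # numbered (1), (2), (3)
    key = [2, 3, 1]
    print(arrange(fragments, key))      # the sentence the key produces
    print(score_item([2, 3, 1], key))   # a correct response
    print(score_item([1, 3, 2], key))   # an incorrect response
```

A score of this kind shows only whether the item was passed. It does not show whether a failure came from vocabulary or from inference, which is the difficulty of analysis noted above.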

ILLUS. 67. SAMPLES FROM THE SANGREN-WOODY READING TEST

Part III. Fact Material

Directions: Write the answer to each question on the dotted line. Use
one word if possible.

The "lead" in your pencil is not made of lead. Long ago people had
lead in their pencils; that is probably why the pencils we use are
called lead pencils. Another mineral called "graphite" is now used.
This mineral is taken from mines in the same way as coal or iron ore.

1. What did people use in their pencils long ago? ....
2. What mineral is used in the pencils now? ....
3. From what is the mineral taken? ....

Part VI. Following Directions

Directions: Do what each paragraph tells you to do.

1. At the right are two squares of different sizes. The larger square
is a playground for children, and the smaller one is a garden into
which children must not go. There should be a fence between the
playground and the garden. Make this fence by drawing a line to
separate the squares.

[Figure: two squares of different sizes]

2. At the right are six circles. They stand for one-half dozen eggs.
The second egg in the row is not a good one and cannot be used for
cooking. In order that Mother will not make a mistake and use it, you
must take your pencil and mark it with a cross.

[Figure: a row of six circles]

(Sangren and Woody, 1927. By permission of the World Book Co.)

Tests of organization of material are shown in Illus. 69. A
paragraph is printed with its phrases numbered consecutively. The
student is asked to select the most important items in the paragraph
and arrange them in an outline. The same type of test without
numbered phrases is much more difficult to score and probably requires
more organizing ability than the numbered form. Much research is
needed to determine and appraise varieties of organizing ability.

3) Perceptual Speed and Span. The amount of written material 
which a person comprehends at a glance is considered to be an im- 
portant aspect of reading. One rough technique for measuring per- 
ceptual speed is the flash card which is used in the earlier grades. At 


ILLUS. 68. SAMPLES FROM THE GATES SILENT READING TEST

TYPE A. TO APPRECIATE THE GENERAL SIGNIFICANCE

This is to be a reading test. You are to read a number of paragraphs.
Below each paragraph are five words. One of the words tells how some
one described in the paragraph felt, whether sad or happy, etc. You
should draw a line under that one, and only one, word to show that you
understand just how the person described in the paragraph did feel.
Now let us try a sample before we begin the real test. Read the
following paragraph and then draw a line under the word which you
think tells best how the person felt.

Once upon a time a young fairy went down to the river to swim. She
jumped in with a splash. She put out her hands and tried hard to swim.
Something seemed to be dragging her down. Oh, it was her wings! She
had forgotten to take them off. Fairy wings become heavy when they are
wet. She cried for help as loudly as she could.

Draw a line under the word which tells how the fairy felt.

cross     angry     weary     afraid     joyful

TYPE B. TO PREDICT THE OUTCOME OF GIVEN EVENTS

This is to be a reading test. You are to read a number of paragraphs.
Below each paragraph are four sentences. Each sentence tells what is
most likely to follow after the happenings that are described in the
paragraph. You should draw a line under one, and only one, of these
sentences to show that you can tell what will probably happen next.
Now, let us try a sample before we begin the real test. Read this
paragraph and then draw a line under the one sentence which you think
tells what will happen next.

The grocery man had a black cat. He loved his cat very much. One day a
lady brought a big bulldog into the store. The grocer's cat raised his
back and said "Meow! Psst!" to the bulldog. Of course the dog did not
like that, so he growled loudly. Before the grocery man or the lady
knew what was happening, the bulldog had sprung upon the cat.

They let the fight go on.
The cat slept on.
The lady took her bird away.
The grocery man saved his cat.

(Gates, 1926. By permission of the Bureau of Publications, Teachers
College, Columbia University.)

later ages two standard tests are common: (a) word-and-number-
comparison tests and (b) speed-of-reading tests.

Word-and-number-comparison tests are well illustrated by the
Minnesota Test for Clerical Workers (1933), which is shown in part
in Illus. 8. The score, which is the number of items compared,
depends upon perceptual speed and familiarity with words and numbers.

In speed-of-reading tests one is asked either to detect errors in a
passage, or simply to read for a given time and to note the number of
words which seem to have been comprehended. The first is illustrated
by the Chapman-Cook Speed-of-Reading Test, which consists of easy
paragraphs of thirty words each. Near the end of each paragraph, one
word spoils the meaning by its incongruity. The task is to cross out
this word. For example, Chapman (1924) uses this sentence:

It was such a cold, boisterous, and wintry day that every person who was
walking wore the thinnest clothes that he could find in his clothes closet.

This type of test has been produced by several authors, and Eurich 
(1931) followed the general idea when he composed a speed-of-reading 
test at the college level, using longer paragraphs and harder words. 
Criticism has been directed against this type of test since it does not 
resemble usual reading activities, but consists of disconnected pas- 
sages which do not allow usual rhythms, requires specific search for a 
single word rather than a comprehension of phrases, and requires the 
crossing out of a word. 

The other type of speed-of-reading test presents a passage for
continuous normal reading during a few minutes but usually fails to
control comprehension in such a fashion that the scores are
comparable with one another. Although directions commonly say to
read slowly enough to understand what one reads, still some
students interpret this to mean skimming and some, a detailed analysis.
Tests of comprehension usually follow the reading of a passage, but
these introduce factors of recall and possible distraction, and no
satisfactory way of combining speed and comprehension scores has
appeared.

ILLUS. 69 SAMPLES FROM THE lOV^A SILENT READING TEST 
Advanced Examination, Grades 7-12 


A. 

Test 4. Sentence Oegaeization 
(Tune Allowance* 4 minutes) 

Dtredtons to the Pupil This test is given to see how well you are able to arrange groups of words into 
sentences. Work the exercises as shown in the sample. 

Samples (1) a wagon, (2) a boy, (3) had , 2 , j, x 

(1) how small, (2) see, (3) he is . x, j , 


1 (1) wanted, (2) to go home, (3) the boy 

2 (1) alwa 3 rs, (2) be rewarded, (3) good deeds, (4) should 

3 (1) as children, (2) they, (3) get stronger, (4) grow older 


ILLUS. 69. S.'VMPLES FROM THE IOWA SILENT READING TEST (Cant’d) 

B. 


Test 5. Paragraph Organization 


PART c. 

(Time allowance: 3 minutes) 

Directions to the Pupil: The following exercises are given to test your ability to arrange the sentences 

of an imorganized paragraph in their proper order. Work all the exercises as shown in the sample. 

Sample: (1) One man found that until he put toads in his greenhouse he could not keep 
insects from eating some of his flowers. 

(2) Sometimes men keep toads in their greenhouses. 2 , i 

The Cattle Tick 

1 (1) Once a territoiy has been made tick free, it is kept so by a strict quarantine against the 
introduction of infested animals. (2) The cotton states and the Federal government have 
made great progress in fighting this pest through the system of dipping cattle to rid them of 
the ticks. (3) Chief among the parasites of cattle is the Texas fever tick, which has caused 
enormous losses in the South. 


Early French Explorers in America 

2 (1) From this highly strategic post scores of explorers departed to become the pioneers of 
France in the new world before Boston and Philadelphia had been founded in the English 
settlements. (2) Quebec, founded by Champlain a year after Jamestown, is located on the 
St. Lawrence, eight hundred miles from the edge of the continent. (3) The early history of the 
Great Lakes region is the record of these French explorers. ^ 


c. 


Test S. Paragraph Organization 

PART B. OUTLINING 
(Time Allowance ; 3 minutes) 

Directions to the Pupil: The following are exercises to test your ability to organize an outline giving the 
most important items of a paragraph. Read the following paragraphs carefully. At the right of the 
paragraphs are outlines partially filled in. Fill in the blank spaces in the outline from your reading 
of the paragraphs by placing in the outline the numbers corresponding to the brackets and be sure to 
select the group of words in the different brackets which will result in a well organized outline. Be 
sure not to include more items in the outline than have been provided for. 


1. In the United States, which is the leading agricultural country in the
world, several causes have combined to encourage this industry. Of these
factors, the more important are the fertility of the soil, the variety of
climate and other conditions of environment, the energy of the people, the
encouragement lent by the government to scientific agriculture, and the
unrivaled transportation system for marketing crops. Land has been very
cheap. High wages in other industries have led to the invention of
machinery by which one man can do the work of many. There is no country
in the world where machinery is used so extensively in agriculture as in
the United States.

[In the printed test, numbered brackets 1 to 20 above the lines mark off
successive groups of words in this paragraph.]


Paragraph 1.

I. Factors encouraging 
American agriculture. 


A. 


B. 


C. 


D. 


E. 


F. 


G. 


(Jorgensen and Greene, 1927. By permission of the Bureau of Educational 
Research and Service, University of Iowa, and the World Book Co.)
Two elaborate and expensive techniques for measuring perceptual
span of words should be mentioned. One technique, which uses a
Metronoscope, presents words or phrases to a student at a given rate
and asks him to record or tell what he has seen. The other technique
photographs eye movement while one is reading. The first method
does not resemble normal reading in certain respects, but it does
give accurate records under the test conditions. It has been proposed
as a means of training slow readers. The second method usually fails
to evaluate comprehension, but it does show precise records of the
number, duration, and order of fixations on a line.

A less expensive technique for training readers has been described
by Dearborn and Anderson (1938). Printed material was photographed
on motion-picture film in such a way that when the film is
projected, successive phrases and lines are exposed as they have been
grouped by skilled readers. The exposure times are limited to
approximately one fifth of a second, thus preventing more than an
optimum number of fixations per phrase. The suddenness with
which the units appear and disappear controls the duration of
fixation time. At the beginning of training the material is simple, the
phrases short, and the exposure rate slow. As training progresses, the
material increases in complexity and the projection rate is more
rapid. Training of this sort has resulted in considerable improvement,
as shown by the Minnesota Speed of Reading Tests and the
Gates Silent Reading Tests on small samples of both college students
and elementary school pupils.

Various groups have called attention during the last ten years to
the fact that many publications intended for popular consumption
were not widely read or understood, because they were difficult to
read. Edgar Dale and Jeanne S. Chall (1948) have compared their
own indices of readability with those proposed by Irving Lorge
(1944) and Rudolf Flesch (1946), in an attempt to determine what
factors contribute to reading ease or difficulty. Lorge published one
of the first simple formulas, based on (a) the number of different
uncommon words, (b) average sentence length, and (c) the relative
number of prepositional phrases. Dale's list of 769 easy words served
as Lorge's common words.

Flesch used three criteria: average sentence length, proportion of
affixed morphemes (prefixes, suffixes, and inflectional endings), and
proportion of personal pronouns. He believed the count of affixes
was a better indicator of abstraction and vocabulary load than
Lorge's uncommon words because the latter failed to discriminate
relative difficulty above the eighth grade.

Dale and Chall found the counting of affixed morphemes to
be laborious even after some training, and marked variations between
counts by different persons appeared. They also found a
number of hard-to-read texts and articles in which the incidence of
personal pronouns was very high. Dale's list of approximately three
thousand easy words, made by trying out approximately ten thousand
words on fourth-grade pupils, was applied in making counts of easy
words in the same series of 376 standard reading passages, the
McCall-Crabbs Standard Lessons in Reading, that were used for criteria
of difficulty by Lorge and Flesch. Intercorrelations of these authors'
results showed that the easy words, using Dale's 3,000-word list,
correlated .683 with the McCall-Crabbs criteria and .615 with Flesch's
"affixes count." The affixes correlated .793 with easy words. The
easy-word scores plus the average sentence length yielded a multiple
correlation with the criteria of .70. Using these two measures, Dale
and Chall have set up tentative grade levels which can be easily
applied to samples from various fields.
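
The two measures retained by Dale and Chall, the share of words falling outside an easy-word list and the average sentence length, are simple counts. A minimal Python sketch of those counts follows. The tiny word set, the function name, and the regular-expression tokenizing are illustrative assumptions, not the authors' procedure, and no grade-level weights are applied because the text does not give them.

```python
import re

# Hypothetical stand-in for an easy-word list; the actual list of about
# three thousand words is not reproduced here.
EASY_WORDS = {"the", "a", "dog", "ran", "to", "his", "house", "saw", "cat"}


def readability_measures(text, easy_words=EASY_WORDS):
    """Return (per cent of words not on the easy list, average sentence length)."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Treat runs of letters and apostrophes as words, ignoring case.
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    hard = sum(1 for w in words if w not in easy_words)
    return 100.0 * hard / len(words), len(words) / len(sentences)


sample = "The dog ran to his house. The dog saw a peculiar cat."
pct_hard, avg_len = readability_measures(sample)
```

In practice the counts would be made on several samples from a publication and averaged before any grade level is assigned.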

MEASUREMENT OF MATHEMATICAL ABILITY

Tests in this group range from the simplest counting operations to
highly complex problem solving. There are three types of skills
involved which appear to be highly independent, at least among adults:
(1) calculation from rote memory and rule, (2) abstract reasoning with
numbers or letters, and (3) geometric reasoning with spatial data.

Arithmetic Tests

A popular type of general arithmetic test is called arithmetical
reasoning or problem solving (Illus. 70). Tests of this sort require the
three factors mentioned above in unknown combinations as well as
some language ability. Nearly all of the elementary tests contain
many interesting practical applications from business, architecture,
surveying, and house management. Tests of this general sort are of
little value for a diagnosis of basic skills, but they show the results of
complex thinking about common objects.

A diagnostic test of mathematical ability is given in Illus. 65A. Here
eight subdivisions are listed, and it appears that James Brown is
above average in Problems, Number Concepts, Symbols and Rules,
and Numbers and Equations, and near or slightly below average in
Addition, Subtraction, Multiplication, and Division. Illustration
65B shows, with its detailed analysis, that his most difficult items in all
eight divisions demanded skill in using fractions and decimals. He
was able to handle simple and abstract numbers fairly well.

The Progressive Achievement Test and a number of similar tests

ILLUS. 70. UNITED STATES ARMY ALPHA TEST 2

TEST 2

Get the answers to these examples as quickly as you can.
Use the side of this page to figure on if you need to.

SAMPLES

1. How many are 5 men and 10 men? Answer ( 15 )

2. If you walk 4 miles an hour for 3 hours, how far do you walk? Answer ( 12 )

1. How many are 40 guns and 6 guns? Answer ( )

2. If you save $6 a month for 5 months, how much will you save? Answer ( )

3. If 32 men are divided into squads of 8, how many squads will there be? Answer ( )

4. Mike had 11 cigars. He bought 3 more and then smoked 6. How many cigars did he have left? Answer ( )

5. A company advanced 6 miles and retreated 3 miles. How far was it then from its first position? Answer ( )

6. How many hours will it take a truck to go 48 miles at the rate of 4 miles an hour? Answer ( )

7. How many pencils can you buy for 40 cents at the rate of 2 for 5 cents? Answer ( )

8. A regiment marched 40 miles in five days. The first day they marched 9 miles, the second day 6 miles, the third 10 miles, the fourth 9 miles. How many miles did they march the last day? Answer ( )

9. If you buy 2 packages of tobacco at 8 cents each and a pipe for 55 cents, how much change should you get from a two-dollar bill? Answer ( )

10. If it takes 8 men 2 days to dig a 160-foot drain, how many men are needed to dig it in half a day? Answer ( )

11. A dealer bought some mules for $900. He sold them for $1,000, making $25 on each mule. How many mules were there? Answer ( )

12. A rectangular bin holds 600 cubic feet of lime. If the bin is 10 feet wide and 5 feet deep, how long is it? Answer ( )

13. A recruit spent one-eighth of his spare change for post cards and four times as much for a box of letter paper, and then had 60 cents left. How much money did he have at first? Answer ( )

14. If 2¼ tons of hay cost $20, what will 4½ tons cost? Answer ( )

15. A ship has provisions to last her crew of 600 men 6 months. How long would it last 800 men? Answer ( )

16. If a train goes 200 yards in 10 seconds, how many feet does it go in a fifth of a second? Answer ( )

17. A U-boat makes 10 miles an hour under water and 20 miles on the surface. How long will it take to cross a 100-mile channel, if it has to go three-fifths of the way under water? Answer ( )

18. If 214 squads of men are to dig 4,066 yards of trench, how many yards must be dug by each squad? Answer ( )

19. A certain division contains 2,000 artillery, 15,000 infantry, and 1,000 cavalry. If each branch is expanded proportionately until there are in all 19,800 men, how many will be added to the artillery? Answer ( )

20. A commission house which had already supplied 1,897 barrels of apples to a cantonment delivered the remainder of its stock to 28 mess halls. Of this remainder each mess hall received 47 barrels. What was the total number of barrels supplied? Answer ( )

(By permission of the C. H. Stoelting Co., and Henry Holt & Company)

are particularly useful for a rapid survey. The subdivisions are probably
too short for reliable testing, such as is needed for appraising
experimental education. Very elaborate instruments are to be found
in the Compass Diagnostic Tests, by Ruch et al. (1925). Approximately
90 distinct acts were listed, and a large number of separate
tests were printed. One of these, Addition of Fractions, is shown in
Illus. 71. Furthermore, a Standard Arithmetic Work Book was designed
to furnish both original learning situations and remedial
drills. Such work books are now numerous and widely used in both
traditional and progressive schools.

ILLUS. 71. SAMPLES FROM THE COMPASS DIAGNOSTIC TESTS
IN ARITHMETIC 

Test V. Addition of Fractions and Mixed Numbers — Form A 

Part 1 — Changing Fractions to Equivalent Forms 

Directions: Change the form of each fraction below to use the denominator given.
Study the samples carefully to see how this is to be done. 

Samples: [The fraction samples and exercises are not legible in this copy.]

Part 4 — Fundamentals of Addition of Fractions


Directions: Find the sum for each example below. Do your figuring on a piece of
scratch paper.




[The numeric fraction exercises are not legible in this copy.]

Three-tenths plus one-half =

(Ruch, Knight, Greene, and Studebaker, 1925. By permission of Scott,
Foresman and Co.)

Another type of analysis is illustrated by the use of the Buswell-John
(1925) Diagnostic Chart. A trained observer watches a pupil
work through a standard 8-page test and notes the types of errors
as they are made. Illustration 72 is of a check list of errors in
addition. Similar lists are available for multiplication, division, and
subtraction. Direct observation gives more insight into the sources of
error than unobserved test scores, particularly if the student talks as
he works.

Algebra, Geometry, and Trigonometry Tests 

For both high school and college, general tests of algebra, geometry,
and trigonometry have been well standardized, and methodical work
books are available for the elementary courses. Thorndike et al.

ILLUS. 72. TEACHER'S DIAGNOSTIC CHART FOR
INDIVIDUAL DIFFICULTIES

Fundamental Processes in Arithmetic

Teacher’s Diagnosis 
for pupil 

Name ________ School ________ Grade ____ Age ____ IQ ____

Date of Diagnosis: Add. ____ Subt. ____ Mult. ____ Div. ____

Teacher's preliminary diagnosis ________

ADDITION: (Place a check before each habit observed in the pupil's work)


___ a1   Errors in combinations
___ a2   Counting
___ a3   Added carried number last
___ a4   Forgot to add carried number
___ a5   Repeated work after partly done
___ a6   Added carried number irregularly
___ a7   Wrote number to be carried
___ a8   Irregular procedure in column
___ a9   Carried wrong number
___ a10  Grouped two or more numbers
___ a11  Splits numbers into parts
___ a12  Used wrong fundamental operation
___ a13  Lost place in column
___ a14  Depended on visualization
___ a15  Disregarded column position
___ a16  Omitted one or more digits
___ a17  Errors in reading numbers
___ a18  Dropped back one or more tens
___ a19  Derived unknown combination from familiar one
___ a20  Disregarded one column
___ a21  Error in writing answer
___ a22  Skipped one or more decades
___ a23  Carrying when there was nothing to carry
___ a24  Used scratch paper
___ a25  Added in pairs, giving last sum as answer
___ a26  Added same digit in two columns
___ a27  Wrote carried number in answer
___ a28  Added same number twice

Habits not listed above ________

(G. T. Buswell and Lenore John, 1925. By permission of the Public School
Publishing Co., Bloomington, Ill.)
(1923) wrote an elaborate Psychology of Algebra which listed fourteen
algebraic abilities, suggested ways of eliminating unnecessary
habits, and also provided or suggested drill on neglected skills.

Out of a large number of available algebra tests two will be mentioned.
Lee (1930) designed a test to indicate ability to succeed in
algebra. Its four parts are:

a. Arithmetic problems where algebra might be helpful

b. Number analogies, such as 3 : 9 :: 5 : ___   30, 15, 20, 23

c. Number series: 4, 8, 10, 20, 22, 44

d. Easy formulas: Given a = 10, b = 8, to find A [the printed formula for A, a fraction with numerator ab, is not legible in this copy]

Lee reported correlations of .71 among 318 high school pupils between
this Algebraic Ability Test and an achievement test after one
semester of algebra. This correlation is nearly the same as that
between achievement tests at 9 and 18 weeks. The Algebraic Ability
Test correlated .631 with algebra grades. These figures are considered
to be about as high as the reliability of grades will permit.
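
The remark that grade reliability limits these correlations can be made concrete. Under classical test theory, the correlation between two measures cannot exceed the square root of the product of their reliabilities. The sketch below applies that bound; the function name and the two reliability values are hypothetical, since the text reports no reliabilities for Lee's test or for the grades.

```python
import math


def correlation_ceiling(reliability_x, reliability_y):
    """Upper bound on an observed correlation under classical test theory."""
    for r in (reliability_x, reliability_y):
        if not 0.0 <= r <= 1.0:
            raise ValueError("reliabilities must lie between 0 and 1")
    return math.sqrt(reliability_x * reliability_y)


# Hypothetical values: .90 for the ability test, .60 for teachers' grades.
ceiling = correlation_ceiling(0.90, 0.60)
```

With these assumed values the ceiling falls between .73 and .74, so an observed correlation in the .60s or low .70s would already be close to the limit.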

The commonest type of algebra achievement test is illustrated by 
the Columbia Research Bureau Algebra Test, which has two parts. 
In Part I skill is required in solving equations, such as: 

1. X + 15 = 23
14. 4X^2 - 16X - 9 = 0
18. square root of X - 2 = 6

In Part II problems such as the following are to be solved: 

2. How long must I make a garden that is 8 feet wide so that it will have
as large an area as my neighbor's, which is 16 feet long and 6 feet wide?

24. The distance (S) a body falls in T seconds is expressed by the formula
S = 16T^2 + VT, in which V is the initial velocity downward. How many
seconds will it take a body to fall 640 feet if thrown downward with a velocity
of 48 feet a second? (Use T for the unknown.)
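
As an illustration of what problem 24 demands, the formula S = 16T^2 + VT with S = 640 and V = 48 becomes the quadratic 16T^2 + 48T - 640 = 0, of which only the positive root is meaningful. The short Python sketch below solves it with the quadratic formula; the function name is an illustrative choice and forms no part of the test.

```python
import math


def fall_time(distance_ft, initial_velocity_fps):
    """Positive root T of 16*T**2 + V*T - S = 0."""
    a = 16.0
    b = float(initial_velocity_fps)
    c = -float(distance_ft)
    discriminant = b * b - 4 * a * c
    # The negative root is discarded because time cannot be negative.
    return (-b + math.sqrt(discriminant)) / (2 * a)


t = fall_time(640, 48)  # 5 seconds
```

Dividing the equation by 16 gives T^2 + 3T - 40 = 0, which factors as (T + 8)(T - 5) = 0, so a pupil can reach the same answer without the formula.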

The scores in Part II involve specific information outside of the field 
of algebra, and some of the problems can be solved without algebra. 
Work is now under way to prepare algebra tests which will place
greater emphasis on utility and cultural aspects by the use of more 
challenging problems and the simplest techniques. Yielding to con- 
tinuous agitation for curricula based on immediate needs and inter- 
ests of students, many high schools have in recent years transformed 
the teaching of algebra from the techniques of solving polynomials 
to approaches to analytical thinking about practical situations, as 
tested in Part II. 

A good elementary geometry achievement test is that of Schorling
and Sanford (1926), part of which is shown in Illus. 73. Part I is a

ILLUS. 73. SAMPLES FROM THE SCHORLING-SANFORD
ACHIEVEMENT TEST IN PLANE GEOMETRY

Form B 

Directions for Part II 

Drawing Conclusions from Given Data 

In Part II of this test, you will be given geometric figures and certain facts
relating to each. Think what fact you could prove about each figure on the basis
of the given information and write a statement of this in the space beside the words
"We can prove that."

Be sure to use each piece of information that is given to you. You will be
allowed 12 minutes for Part II.

Example: 

Given △ABC with AB = AC.

We can prove that ........

[Since ABC is given as an isosceles triangle, we
can prove that ∠B = ∠C, so

∠B = ∠C

should be written on the dotted line.]

Directions for Part V 
Computation 

In Part V of this test, you will be given diagrams about which certain facts
are known. You are to use these facts in computing such things as the length of
a line, the size of an angle, etc., according to the questions that are asked. Read
each question, study the figure upon which the question is based, and write the
answer in the space beside the question.

You will be allowed 12 minutes for Part V. 

Example: 

In the rectangle ABCD, AB is 12 in., BC is 8 in.

What is the area of the rectangle? Ans. ........

[96 sq. in. should be written on the dotted line.]

Now try this:

What is its perimeter? Ans. ........

Be sure to express your results in numerical form.

(Schorling and Sanford, 1926. By permission of the Bureau of Publications,
Teachers College, Columbia University.)
vocabulary test. Part II requires a statement of what can be proved
from given data. Part III asks that the correctness of certain
conclusions be judged. Part IV requires an analysis of constructions. Part
V demands the computation of distances or areas.

Total scores on tests of this kind usually correlate approximately
.75 with the midyear grade for a beginning course. They also show
equivalent-form reliability of about .85. The subdivisions, which
are doubtless less reliable, furnish valuable evidence of a pupil's
weak and strong points. The intercorrelations of the subdivisions are
low enough (usually about .50) to suggest a number of independent
skills, but no careful factor analysis of such skills has come to hand.
A subjective analysis indicates that probably four independent
factors are at work: verbal information, reasoning, spatial imagery, and
number relations.
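
The statement that the subdivisions are less reliable than the total can be illustrated with the Spearman-Brown formula, which relates reliability to test length. The sketch below is only a rough illustration under stated assumptions: it treats the five parts as equal in length and parallel in content, which the text does not claim, and the function name is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# A total-score reliability of .85 spread over five assumed equal parts:
# each part is one fifth as long as the whole test.
part_reliability = spearman_brown(0.85, 1 / 5)  # about .53
```

A part reliability near .53 would explain why part scores are better used as rough evidence of weak and strong points than as precise measures.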

Tests of advanced work in algebra and in plane and solid geometry
are found in the Cooperative Test Series. These include items to
appraise calculations and reasoning as emphasized in standard
textbooks.

A good standard trigonometry test is the American Council on
Education Trigonometry Test (1930), the following five parts of which
are to be completed in 72 minutes:

I. Technical knowledge, as:

Sec θ is positive and tan θ is negative; the quadrant of θ is: first,
second, third, fourth.

II. Completion of equations:

X = 6, Y = 3; tan θ =

III. Indicate method of solving verbal problem.

From the top of a cliff 350 feet high the angle of depression of a
buoy is 16° 38'. Find the distance from the buoy to the top of the
cliff.

IV. Indicate method of solving diagrammed problems similar to Part III.

V. This is like Part IV, but more difficult. 
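
The cliff problem in Part III shows the kind of method the examinee is asked to indicate. The cliff height is the side opposite the angle of depression and the line of sight is the hypotenuse, so the distance is the height divided by the sine of the angle. A short Python sketch of that computation follows; the function and parameter names are illustrative and are not taken from the test.

```python
import math


def line_of_sight(height_ft, depression_deg, depression_min=0.0):
    """Distance from the cliff top to the buoy, given the angle of depression."""
    # Convert degrees and minutes to radians before taking the sine.
    angle = math.radians(depression_deg + depression_min / 60.0)
    return height_ft / math.sin(angle)


d = line_of_sight(350, 16, 38)  # roughly 1,220 feet
```

The test itself asks only that the method be indicated, not that the arithmetic be carried through.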


SPECIAL STUDIES 

Achievements in special studies have been appraised with local 
tests more often than with standardized instruments. This practice
is probably due to the variation in objectives in different localities 
and the lack of well-standardized, acceptable tests. Most of the pub- 
lished tests cover only elementary aspects, since advanced activities 
are extremely complex and are carried on by only a few persons. 
Special tests of physical sciences, social studies, foreign languages, and 
business will be briefly discussed. 
Physical Sciences

Appraisals of the results of any science course have traditionally
taken the form of questions about facts and natural laws. A good
illustration is the Iowa High School Content Examination, part of
which appears in Illus. 74. It consists of fifty multiple-choice
questions on a variety of subjects. A more diagnostic procedure is shown
by tests of the Cooperative Test Service. This nonprofit organization,
established by the American Council on Education in 1930, has
produced, through a large number of contributors, annual forms of
reliable tests of English, foreign languages, mathematics, natural
sciences, and social studies for high school and college levels. The
Cooperative Chemistry Test (1934), shown in Illus. 12, is interesting
because it attempts an analytical approach by the use of three parts.
Part 1 requires knowledge of facts and natural laws, Part 2 requires
knowledge of terminology, and Part 3 requires statements of probable
outcomes and also explanations for these statements. Part 3 is
probably much more dependent upon reasoning activities than Parts 1
and 2. Tests of the type of Part 3 are being more extensively used
now than formerly because they are thought to appraise an important
objective of the course, namely, insight into causal relationships.
Buckingham and Lee (1936) found marked differences between tests
of science facts and tests of ability to point out relations between
laws and phenomena. Some of their freshman group made good
scores on the factual tests and very low scores on the tests which
demanded careful inferences. Their results, which are typical of several
other reports, point to the need of measuring knowledge of facts
separately from reasoning when careful analyses are desired. (Similar
biology and physics tests at both high school and college levels
are found in the Cooperative Test Series, the American Council on
Education Series, and a number of others.) Other objectives of a
science course, such as skill in the laboratory, known scientific
theories, and practical applications, are difficult to appraise. Wrightstone
(1936) has published an initial study of such appraisals in high
school science courses.

Social Sciences 

History, civics, and geography are the main social studies in high 
school, and nearly all the standard tests deal with these topics. Social
studies broaden into the humanities in college, including economics, 
sociology, philosophy, psychology, and political science. A compre- 
hensive volume by Kelley and Krey (1934) discusses tests of knowl- 
edge and attitudes in social studies. In this volume the report of Pres- 
ILLUS. 74. SAMPLES FROM THE IOWA HIGH SCHOOL CONTENT
EXAMINATION

Section 3

SCIENCE

(Time allowed: 10 minutes)

1. Combustion is another name for (1) freezing (2) drying (3) boiling (4) burning (5) melting

2. A gas which supports combustion is (1) hydrogen (2) nitrogen (3) carbon dioxide (4) oxygen (5) carbon monoxide

3. H2SO4 + BaCl2 = 2 HCl and (1) H2O (2) Ba(OH)2 (3) SO3 (4) BaSO4 (5) H2ClO

4. The freezing point on the Centigrade thermometer is (1) -273° (2) 0° (3) 32° (4) 100° (5) 212°

5. Substances which hasten a chemical reaction without themselves undergoing any chemical change are called (1) catalysts (2) electrolytes (3) ionogens (4) allotropes (5) anhydrides

6. The formula for hydrochloric acid is (1) NaOH (2) HCl (3) HNO3 (4) MgSO4 (5) H2SO4

7. Hydrogen may be made by the action of hydrochloric acid on (1) zinc (2) sodium chloride (3) copper sulphate (4) sodium hydroxide (5) potassium chlorate

8. The length of a meter in inches is about (1) 12 (2) 27 (3) 39 (4) 72 (5) 144

9. An example of a chemical change is the (1) desiccation of spores (2) oxidation of sugar (3) dissolving of salt (4) osmosis of glucose (5) exchange of gases in the lungs

10. Barometers are used to measure (1) humidity (2) rainfall (3) air-pressure (4) gravity (5) electricity

11. The center of our universe is the (1) earth (2) moon (3) sun (4) Jupiter (5) Mars

12. An instrument depending on atmospheric pressure for its operation is the (1) siphon (2) hydraulic press (3) thermometer (4) telephone (5) voltaic cell

13. The force needed to raise a weight of 1200 pounds on a hydraulic press whose piston areas stand in the ratio of 1 : 6 is (1) 100 lbs. (2) 200 lbs. (3) 300 lbs. (4) 600 lbs. (5) 2400 lbs.

14. One organism which possesses antennae is the (1) earthworm (2) starfish (3) hydra (4) amoeba (5) grasshopper

15. Electric charges can be detected by means of (1) condenser (2) electroscope (3) dynamo (4) voltaic cell (5) spectroscope

(Ruch, Cleeton, and Stoddard, 1925. By permission of the Bureau of
Publications, University of Iowa.)
sey is of particular interest. She first established the master list of
terms used in history, shown in Illus. 60, and then constructed
multiple-choice items for 346 words and applied these items to more
than 11,000 pupils in the fourth, sixth, eighth, tenth, and twelfth
grades. The proportions of students in each grade who passed seven
of the items are shown in Illus. 75. This figure shows that mastery
of three of the items was fairly complete in the eighth grade and that
ILLUS. 75. GROWTH OF SOCIAL VOCABULARY

[Graph: per cent of students passing (vertical axis) plotted against grade (horizontal axis).]

Per cents of students in grades IV to XII who succeeded on
certain vocabulary items

(After Pressey, 1934. By permission of Charles Scribner's Sons.)

the item for the word enact was answered about as well by the fourth
as by the tenth grade. The items for the words competition and 
domestic showed fairly regular increases, and the item for sedition 
proved to be difficult before the tenth grade. 

The seven items are shown in Illus. 76. An inspection of these items
will show that some words are defined more completely than others.
Certain items require difficult discriminations in word usage. The
difficulty of an item is here determined partly by difficulty of defining
the key word, and partly by the context of the item. In order to
ILLUS. 76. TEST OF CONCEPTS USED IN THE SOCIAL STUDIES

1. To what does agriculture refer? (a) fishing (b) mining (c) farming
(d) manufacturing

2. How do laborers most often attempt to force an employer to raise their wages?
(a) by going to war (b) by going to another factory (c) by going to a
foreign country (d) by going on strike

3. Which phrase refers to the entire business of making woolen clothing? (a) the
woolen shops (b) the woolen industry (c) the wool-growers (d) the
sheep ranchers

4. Which of the following words applies to the practice of lowering of prices by one
company so as to get business away from another? (a) rebellion (b) emigration
(c) output (d) competition

5. Which are enacted? (a) verdicts (b) debts (c) laws (d) dispatches

6. Which word refers to the affairs relating to one's own country? (a) foreign
(b) international (c) domestic (d) diplomatic

7. Which is a form of conspiracy against one's country? (a) sedition (b) opposition
(c) criticism (d) immigration

(Pressey, 1934, p. 189. By permission of Charles Scribner’s Sons.) 

discover whether such multiple-choice items are indicative of one's 
ability to write a correct definition, Pressey made a comparison of 
scores on two vocabulary tests, using the same words. In one test the 
children checked multiple-choice items, such as those given in Illus.
76, and in the other, they wrote definitions for the key words. She 
found that approximately 70 per cent of the words correctly defined 
were also used correctly in the multiple-choice forms and that the 
correlation between total scores on the two tests was high. The form 
in which an item is cast undoubtedly affected the task to some de- 
gree, but the results from the two types of tests were similar. The 
relative advantages and the intercorrelations of various forms of test 
items have been discussed in Chapter IV. 

From studies such as this the effectiveness of various learning situa- 
tions may be compared for each item. Pressey concluded from her 
study (p. 174) that “not enough special vocabulary is acquired for 
the reading of the average textbooks used in these grades.” From 
such studies it is also possible to compose tests of items which have 
known levels of difficulty for particular populations. Kelley and Krey 
(1934) suggested that the best items for careful scale construction were
those which, like 4 and 7 in Illus. 76 and Illus. 75, show regular
growth increments. 
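
Selecting items that "show regular growth increments" amounts to checking that the proportion passing rises steadily from grade to grade. The sketch below shows one way such a check might be made. The pass rates, the minimum step of .05, and the function name are all hypothetical; they are not Pressey's figures or Kelley and Krey's criterion.

```python
def shows_regular_growth(pass_rates, min_step=0.05):
    """True if every grade passes the item at a rate at least min_step above the one before."""
    return all(later - earlier >= min_step
               for earlier, later in zip(pass_rates, pass_rates[1:]))


# Hypothetical proportions passing in grades IV, VI, VIII, X, and XII.
items = {
    "competition": [0.30, 0.50, 0.70, 0.85, 0.95],
    "enact": [0.40, 0.42, 0.41, 0.43, 0.60],
}
selected = [word for word, rates in items.items() if shows_regular_growth(rates)]
```

Under these invented figures only the first item would be kept, since the second shows almost no change between the fourth and tenth grades.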

Standard high school tests of this sort are the Kelty-Moore (1934)
Tests of Concepts in Social Studies, the annual American Council on
Education Civics and Government Test, and the annual Cooperative
Tests of American, Ancient, Modern European, English, and Medieval
History, of Economics, and of Contemporary Affairs. Terminology,
descriptive information, and careful inferences are measured
separately, and also added to give a general score. An important
contribution to measurement in American history is a handbook on
selected test items by Anderson and Lindquist (1949).

A number of geography tests have been prepared by state testing
bureaus. Typical of a general test is that of Torgensen (1933), which
is a 64-item multiple-choice test for the sixth through the eighth
grades that requires knowledge of the earth's surface, tides, natural
resources, populations, climates, and map reading.

A test of general skills basic to effective work in social sciences was
published by Wrightstone (1936). The four parts, each of which
consists of about twenty-five items, are as follows:

1. Obtaining facts. This part measures a pupil's ability to read
charts, graphs, and maps and to use an index.

2. Organizing facts. This is a test of outlining materials and
judging their value.

3. Interpreting facts. This is a reading test in which statements
are to be marked true or false according to a given paragraph.

4. Applying generalizations. In this part multiple choices are
used, as in Illus. 12, to explain events and support the explanation.

A test similar in design and content was also published for natural 
sciences by the same author. General skills tests undoubtedly tap the
same skills that have been described under language and mathematics, 
but the combination of skills is probably original. 

Two interesting tests for adults in the field of social adjustment 
have recently been published. Horrocks and Troyer (1946) have
brought out a test of diagnosis and treatment of social, emotional, 
physical, economic, intellectual, and academic problems of high 
school students, called Tests of Human Growth and Development. 
Three case histories are used and each history is divided into three 
parts of about three hundred words in length. After reading each 
part of a case history, the person being tested answers about twenty- 
five multiple-choice questions on diagnosis, and about twenty on 
remedial measures. Five choices ranging from true through possibly 
true, no evidence, possibly false, and false are to be used in answer- 
ing diagnostic questions, such as “For a girl of her age, Connie is 
more than usually self-conscious and sensitive.” 

Five choices (strongly agree, agree but with reservations,
undecided, disagree but with reservations, and strongly disagree) are
to be used in answering such questions as “Connie needs to increase
her social participation.”

From 50 to 90 minutes are needed for each case. There are no time 
limits. Scores are secured by comparing one’s answers with a key 
which represents the opinions of ten experts. Since there are few 
right and wrong answers, a weighted scoring system is used, giving
less credit for the less adequate answers. The reliability correlations
are about .70, and the correlations between cases were .39, .55, and
.62. These tests, although far from perfect, show that a standard test
in diagnosis and treatment is possible, and that the results will help
to clarify language and concepts in this field, as well as to appraise
ability more accurately than is usually done.

The same authors have produced an 80-item test, the Test of 
Knowledge of Fact and Principle in Human Growth and Development, 
which includes a wide variety of topics, for example, the function 
of endocrine glands, sex differences, parent-child relationships, 
mental and physical growth and standards, philosophies of life, 
personality structures and defense mechanisms, and remedial practices. The 
reliability is reported to be .91 among college students. The 
correlations with the case-study scores range from .24 to .49 (median about 
.30). Centile norms are given for samples of from three hundred to 
eight hundred college students. 

Another worth-while test in this field is reported by Helen Nahm 
(1948), who lists forty-three mental-hygiene principles, then sixty- 
six items in a Mental Hygiene Test for Nurses. The principles are 
illustrated by the following: 

Maladjustment is usually of multiple causation rather than of one cause 
alone. 

When individuals are frustrated in their attempts to satisfy basic 
personality needs, they may compensate by substituting other attainable goals, 
which may or may not be desirable ones. Certain types of behavior may be 
tension-reducing for the individual, but not acceptable to society. 

An item from the Mental Hygiene Test is: 

Betty L. has average ability but is failing in most of her courses. She seems 
restless and nervous and appears to be uninterested in anything except 
having a good time. Which of the following methods would probably be 
most effective in helping her to improve? 

a. Advising her to spend more time in study. 
b. Restricting her social activities until her grades improve. 
c. Trying to help her to find a solution for some of her unsolved 
problems. 

Nahm also presented an 80-item test on autocratic-democratic 
practices. This is an attitude test in which one is asked to indicate 
degree of agreement with such statements as: “Instructors and 
supervisors should encourage students to intelligently criticize accepted 
hospital routines and procedures.” Centile norms for twelve schools 
of nursing show wide variations. 

An appraisal of Activities in Social Studies, shown by the check 
list in Illus. 77, was used by Wrightstone (1936). It divides activities 
into three groups: self-initiated, cooperative, and recitational. In 
his comparison of traditional and progressive schools Wrightstone 
found that pupils in the latter showed much more self-initiated 
activity than those in the traditional schools. This check list is a significant 
instrument because it draws the attention of instructor and pupil to 
activities which are probably of much greater social importance than 
the acquisition of information in a limited field. 

ILLUS. 77 CHECK LIST OF ACTIVITIES IN SOCIAL STUDIES 

A. Self-Initiated Activities 

1. Bringing voluntarily contributions (clippings, exhibits, books, charts, etc.) 
for school activities 

2. Submitting voluntarily and orally data, or information, gained outside 
school (observation, trips to buildings, places, travel, museum, radio, 
lectures, movies, independent reading, etc.) 

3. Presenting an organized written report showing research or investigation 
by pupil 

4. Volunteering as leader or special worker on project or task 

5. Suggesting methods, materials, activities, etc., for developing a project or 
problem 

6. Defending a point of view in which the pupil believes 

B. Co-operative Activities 

1. Criticizing (praising or challenging) a contribution 

2. Asking chairman or teacher pertinent subject matter questions which relate 
to the theme or topic of the group discussion, excluding routine class 
management questions 

3. Offering objects (books, chairs, pencils, etc.) to teacher, pupil, or visitors 

4. Responding quickly to requests for quiet, material, etc. 

C. Recitational Activities 

1. Responding to questions on assigned textbook or subject matter 

(Wrightstone, 1936, p. 134. By permission of the Bureau of Publications, 
Teachers College, Columbia University.) 


Foreign Languages 

Carefully standardized tests are available for some of the skills 
taught in beginning courses in foreign languages. The Cooperative 
French Test (1934), which is in three parts, is a good illustration of 
such a test: 

I. Consists of 80 multiple-choice items where a short statement in French 
is to be read and completed sensibly, thus: 

1. Le mari de ma grand'mère est mon: grand-père, cousin, père, oncle. 

II. Here are 100 vocabulary items where a key word is to be matched 
with its closest synonym: 

1. retourner: 1 rentrer, 2 finir, 3 toucher, 4 recevoir, 5 tomber. 

III. Here 100 grammar items appear in which the correct form is to be 
supplied from five choices: 

Sample 

Choices: 1. le   2. la   3. une   4. un   5. no additional word 

I have a pen.        J'ai ___ plume.           (3) 
Here is the book.    Voici ___ livre.          (1) 
She is in France.    Elle est en ___ France.   (5) 

French teachers often criticize this sort of test. They believe that: 

1. The material is too fragmentary. 

2. The material is uninteresting. 

3. Some vocabulary items can be correctly answered by noting similarities 
in word forms, without knowledge of French. 

4. The student is not asked to compose any French. 

5. The test situation is very different from normal reading situations. 

6. The grammar section emphasizes proofreading ability which shows 
little relationship to skill in correct reading or speaking in real life situations. 

7. There is no check on spelling or use of accents. 

For survey purposes, however, these tests are usually recognized as 
valid measures of general vocabulary, proficiency in comprehension 
of short passages, and the commonest French syntax. The Cooperative 
Test Service has developed similar tests for achievement in German 
and Spanish. Elaborate tests of the same sort are to be found in the 
College Entrance Board Examinations and in the several state 
examination programs. 

Business Achievement 

Tests in this field may be conveniently placed in two groups, 
clerical skills and business information. The former are fairly well 
developed, whereas the latter are in a more experimental stage. 

Clerical Tests. In 1922 Thurstone published an examination in 
typing in three parts. The first part presents a typewritten page upon 
which numerous corrections had been made with pen and ink and 
asks that a correct copy be made. The second part requires 40 
handwritten items to be typed in columns on a blank. The third contains 
48 words among which misspellings are to be detected. Separate scores 
for errors and speed are to be recorded and also combined by the addi- 
tion of total scores. The skillful typists made the lowest scores. 

Blackstone (1923) standardized tests of typing in which business 
letters in correct typewritten form were presented to be copied during 
a 3-minute period. Norms for speed and accuracy of typing were 
given separately and also combined into a score thus: 

\[ \text{score} = \frac{\text{strokes per minute} \times 10}{\text{errors} + 10} \] 

The scores were found to have a repeat reliability of .93 for a group 
of pupils with 20 months of instruction. The average pupil with 5 
months of instruction secured 88 points; those with 20 months, 206 
points; and with 30 months, 236 points. 
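
The combined score can be worked through in a few lines. The following is only a sketch of the formula as stated in the text, score = (strokes per minute × 10) / (errors + 10); the function name and the two sample pupils are hypothetical and are not taken from Blackstone's norms.

```python
def blackstone_score(strokes_per_minute: float, errors: int) -> float:
    """Combined speed-and-accuracy score as given in the text:
    (strokes per minute x 10) / (errors + 10)."""
    if errors < 0:
        raise ValueError("errors must be non-negative")
    return strokes_per_minute * 10 / (errors + 10)


# Hypothetical pupils (illustrative numbers only, not Blackstone's data).
careful = blackstone_score(strokes_per_minute=150, errors=0)  # slower, no errors
hasty = blackstone_score(strokes_per_minute=180, errors=10)   # faster, ten errors

if __name__ == "__main__":
    print(f"careful typist: {careful:.1f}")
    print(f"hasty typist:   {hasty:.1f}")
```

Because 10 is added to the errors in the denominator, an errorless paper scores exactly its strokes per minute, and ten errors cut the score in half. In this illustration the faster but careless typist therefore ranks below the slower, accurate one.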

Shorthand tests have been standardized in connection with courses 
of instruction. Bisbee (1933) prepared a test series which is typical. 
Test 1 measures skill in outlining approximately 122 different Gregg 
principles, 60 phrases, and 70 brief forms. In Test 2 two letters are 
dictated at different speeds. From this material errors in outlining 
are scored. Test 3 requires correct English spelling of 42 hard words. 
Transcription of Tests 1 and 2 is also required after their outlines 
have been scored. The typewritten sheets are scored for errors in 
transcription, spelling, and punctuation. 

A number of tests have recently been standardized and issued, 
which illustrate a combination of achievement and closely related 
aptitude measures. Typical of these is the Psychological Corporation 
General Clerical Test (1944), which gives an over-all score to 
represent general clerical aptitude, three section scores (for comparison 
and filing, for mathematics, and for verbal skills), and nine separate 
test scores. The marked resemblance between this test and a 
standard achievement test is shown in Illus. 57. The same basic language 
and number skills are tested. The achievement tests give more 
emphasis to special subjects taught in school, while the clerical tests give 
more emphasis to speed in comparison of words and numbers, and 
filing or sorting. Other clerical tests are given in Appendix II. 

One of the most complete and well-standardized series of clerical 
tests is issued, usually in annual form, by the Joint Committee on 
Tests of the United Business Education Association and the National 
Office Management Association. These cover the following subjects: 


Test                          Time, in Minutes 

Bookkeeping                   [illegible] 
Business Fundamentals         35 
Business Information          25 
Filing                        120 
Stenography: 
   Dictation                  [illegible] 
   Transcription              120 
Typing                        120 
Machine Calculation           120 


Another complete series, the United States Armed Forces Institute 
examinations, which is issued by the Cooperative Test Service and 
the Science Research Associates, includes: 


Test                                                   Time, in Minutes 

Bookkeeping and Accounting: first and second years     180 
Business Arithmetic                                    135 
Business English                                       120 
Commercial Correspondence                              120 
Gregg Shorthand (phonograph record)                    120 
Typewriting: first and second years                    60 


One of the best recorded tests for stenography is that of Seashore 
and Bennett, which consists of standard phonograph records 
containing five letters dictated at different rates. Two are short and slowly 
given; two are of medium length and are given at average speed; 
and one is long and given rapidly. Alternative forms are provided. 
About 16 minutes are needed for dictation and 30 minutes for 
transcription. 

A different type of stenography test is that of Blackstone and 
McLaughlin (1932), which has seven parts. The first six parts are 
short examinations of English techniques and information about 
business practices; the seventh is a transcription test: 

1. Grammar, spelling, and punctuation errors were to be detected (30 
items, 8 min.). 

2. Correct syllabification was to be recognized from 4 choices (20 items, 
4 min.). 

3. Office practices. This is an information test with a few items concerning 
the right thing to do (20 items, 5 min.). Thus: 

a. Biographical sketches of men who have been successful in the arts and 
sciences may be found in: 

1. The World Almanac, 2. The Statesman's Yearbook, 
3. R. G. Dun's Reports, 4. Who's Who ( ) 

b. On the first day on a new job, a girl is called by her first name by one 
of the men working with her. She does not like this. She should: 

1. Tell him that she prefers to be called by her last name, 2. Show 
her disapproval by a cold manner, 3. Call him by his first name, 
4. Report him to her chief ( ) 

4. The alphabetical filing of 20 names among 28 already listed 
alphabetically (6 min.). 

5. Abbreviations of 20 terms are to be written, such as Certified Public 
Accountant (C.P.A.), and the unabbreviated expressions are to be 
supplied for 20 abbreviations, such as cr. (credit), mgr. (manager) (5 min.). 

6. Business organization. In this test, twenty types of information are 
listed, and twelve department names. The task is to indicate the 
department which should be consulted for each type of information (5 min.). 

7. Transcription. Here short letters are read by the examiner and 12 
minutes are allowed for typing. The manual provides seven letters arranged 
in order of difficulty or complexity, and prints the standard speed of 
reading the short passages in each letter. The more complex letters are to be 
read faster, ranging to 120 words per minute. The hardest transcriptions 
are allowed five times as much credit as the easier. The examiner is told 
to choose the letters which she thinks the class as a whole can transcribe 
“completely and correctly.” 

Norms for these seven tests are available for a thousand pupils 
distributed among four semesters of high school work, and for a 
large sample of stenographers. The stenographers and the 
fourth-semester pupils had the same total scores. The latter were 12 points 
ahead of the former in transcription, but the stenographers showed 
more skill in tests 2, 3, and 5. The equivalent-form reliability for the 
total score was reported to be .88, and correlations between scores 
and efficiency ratings were .62 and .79 for small groups in service. 

This test is interesting because it appraises more than 
transcription ability. Tests 1, 2, and 5 are typical of English usage in general. 
Tests 3 and 6 deal with office background, and Test 4 requires filing 
activity, all of which can be done by persons without any stenographic 
training. Such items have been included in tests of general clerical 
work. One of the first of these was developed by Thurstone (1922). 
It consists of eight parts: 

1. Checking addition and subtraction of numbers 

2. Detecting misspelled words in a long passage 

3. Drawing a line through X, Z, U, and C, but no other letters, in pied 
type 

4. Associating letters with numbers 

5. Classifying men's names according to location in cities and also 
alphabetically 

6. Classifying insurance items according to amount and date 
simultaneously 

7. Easy arithmetic problems 

8. Matching proverbs 

Since many of the skills needed in this test may be well developed 
before one starts formal clerical training or employment, the Thur- 
stone Clerical Test has often been used as an aptitude test to predict 
success. Information concerning the use and validity of tests of this 
type should be sought in technical works on industrial psychology. 

Bookkeeping Tests. An elaborate set of four bookkeeping tests 
at high school level was prepared by Breidenbaugh (1940). A Work 
Sheet, Balance Sheet, Profit and Loss Statement, and Closing Entries 
are to be filled in correctly from figures that are furnished. 
Knowledge of terms and business practices is also appraised by separate tests, 
which were constructed to cover the basic practices given in several 
textbooks. Their reliabilities are not given, nor is there any 
indication of prediction of success in bookkeeping work. The tests are 
designed as achievement examinations in a course of instruction. 

A more general test of bookkeeping is that by Elwell-Fowlkes 
(1928). Without making actual calculations, one is required to give 
information about general theory, journalizing, classification, 
adjusting entries, closing the ledger, and statements. Fifty minutes are 
allowed to complete ninety items. Two forms are available which 
correlated .82 among 258 students in half-year classes. 

Business Information. A general test of business information was 
published by Thurstone (1921) and a similar but more detailed test 
by Thompson (1937). In the latter, 220 items, which are to be 
completed in 80 minutes, are classified as follows: 


Topic                         Items 

arithmetic                    17 
organization and ownership    30 
communication                 26 
economics                     17 
money and banking             30 
selling and advertising       19 
purchasing                    11 
investment and insurance      41 
record filing                 16 
travel                        13 


Norms for 790 high school students are given for two equated forms. 
Total scores on the two forms were found to correlate approximately 
.94 for three groups of about one hundred students. The subtests are 
probably too short to be used for diagnoses of separate skills, 
although they do furnish a rough basis for quantitative analyses. 

Tests for Professional Aptitudes 

A fairly large number of tests of professional and technical 
aptitudes have been developed during the last few years, most of which 
are closely associated with the results of prerequisite courses. Some 
of these are given in Appendix II. They may be roughly grouped 
into two classes, those that are principally subject matter 
examinations and those that include general thinking, reading, and problem 
solving. Among the first class are the Graduate Record Examinations 
of the Carnegie Foundation for the Advancement of Teaching. These 
do not follow the curriculum of any school, but are designed to 
cover broadly eight principal areas of liberal education: mathematics, 
physics, chemistry, biology, social studies (history, government, and 
economics), literature, fine arts, and general vocabulary. 

Norms published each year are similar to those shown in Illus. 78, 
which shows the scores for 1946. There the mean scores for groups 
specializing in fifteen different fields of graduate study are given. 
The expected mean profiles appear. For instance, those studying 
biological science have high scores in fields where those studying 
fine arts have low scores, and vice versa. The fact that the literature 
and arts majors excel on the general vocabulary test indicates that it 
emphasizes literary, art, and musical terms. 

Other tests of this same sort are the Cooperative Tests, and the 
College Entrance Board tests in physics, chemistry, and general 
science, at both the college and the high school level. They are useful 
for selecting technicians of less than college graduate level. A group 
of well-planned tests in the field of engineering is issued by the 
United States Armed Forces Institute, which includes, at the college 
junior or senior level, electronics, drawing, mechanics, machine 
design, strength of materials, surveying, and radio and Diesel 
engineering. 

Tests which evaluate both technical subject matter and general 
aptitudes are well illustrated by teaching and nursing aptitude tests, 
which sometimes also appraise interest or attitude. Recently the 
Educational Testing Service (ETS) has developed and tried out a 
battery of tests for law school and medical college admission. The latter 
is an 8-hour test which consists of the following: 


General Ability 
   1. Verbal ability (vocabulary and comprehension): 
      scientific materials, social materials, 
      humanistic materials                            105 min. 
   2. Quantitative ability                             60 min. 

Achievement 
   1. Understanding of modern society                  90 min. 
   2. Premedical science                               90 min. 

                                            Total     345 min. 


In Illus. 79 there are some well-developed sample test questions from 
this battery. 


ANALYSES OF RESULTS 

Since the raw scores of achievement tests do not show the 
significance of the finding, several methods of interpreting scores have 
been devised. One group of investigators have used mean scores of age 
or grade groups to construct scales. Illustration 133 shows a profile 
which gives both educational-age (EA) and chronological-age 
equivalents and mean scores for each tenth of a grade from grades 2.6 to 
10.0. 

From this figure it appears that Eleanor Brown’s highest score 

ILLUS. 79 MEDICAL COLLEGE ADMISSION TEST 

SAMPLE TEST QUESTIONS 

The following sample test questions are intended to familiarize you with the 
main types of questions used in the Medical College Admission Test and with 
the manner in which the answers are to be recorded on the special answer sheets. 
When you have tried these questions, check your answers against the list of correct 
answers on page 14. 

VOCABULARY 

Sample Directions 

In answering the questions in this test, decide which of the five suggested answers 
has most nearly the same meaning as the capitalized word. Then, on the answer 
sheet, blacken the space beneath the number corresponding to that of the word you 
have selected. 

Sample Questions 

1. CARCINOMA 
   (1) carcass 
   (2) cancer 
   (3) calcification 
   (4) infection 
   (5) excretion 

2. MORES 
   (1) knowledge 
   (2) laws 
   (3) thoughts 
   (4) customs 
   (5) superstitions 

3. AUDACIOUS 
   (1) splendid 
   (2) loquacious 
   (3) cautious 
   (4) auspicious 
   (5) presumptuous 

(In Question 1, suggested answer 2 is the “best” answer. Therefore, space 2 on 
the sample answer sheet has been blackened.) 


COMPREHENSION 

Sample Directions 

This test includes reading passages, each of which is followed by several questions 
based upon its content. Read each passage carefully and then answer the questions 
following it by selecting the best choice for each question and blackening the space 
beneath the corresponding number on the answer sheet. 

Sample Passage 

The term albedo is used to indicate the reflecting power of an object. Technically 
defined, albedo is the ratio of the radiation reflected from an object to the total 
amount incident upon it. For example, the albedo of the moon is 0.073, which 
means that the moon reflects that fraction of the sunlight which is incident upon it. 
The value of the albedo of a planet is a measure of the quantity of atmosphere 
which surrounds the object. The higher the albedo, the thicker the atmospheric 
layer. In the case of objects without atmosphere, as in the case of the moon, the 
albedo combined with the color of the reflected light may be used to make estimates 
of the character of the material making up the surface of the object. 

Sample Questions on Passage 

4. Judging from this passage, what may we infer regarding the albedo of the Earth? 
   (1) That it is greater than 0.073 
   (2) That it is smaller than 0.073 
   (3) That it is approximately the same as that of other planets 
   (4) That it is greater than the albedos of other planets which are farther from 
       the sun 
   (5) We cannot infer anything about the Earth's albedo 

5. When the albedo is calculated, to which of the following is the value of 1 
   assigned? 
   (1) The amount of light reflected from the object 
   (2) The amount of light the object receives 
   (3) The total amount of light given off by the sun 
   (4) The total amount of light received by the Earth 
   (5) The average amount of light received by the planets 


QUANTITATIVE ABILITY 

Sample Directions 

In this test solve each problem and then indicate the one correct answer for each 
in the proper space on the answer sheet. 

Sample Questions 

6. It is known that every circle has an equation of the form 
   \( Ax^2 + Ay^2 + Bx + Cy + D = 0 \). 
   Which of the following is the equation of a circle? 

   (A) \( 2x - 3y = 6 \) 
   (B) \( x^2 - y^2 + 4x - 2y + 3 = 0 \) 
   (C) \( 3x^2 + 3y^2 - 2x + 6y + 1 = 0 \) 
   (D) \( 2x^2 + 3y^2 + 6x + 4y + 1 = 0 \) 
   (E) None of the above 


Questions 7-8 

The tabulation below shows the frequency with which 600 employees of a certain 
industrial plant met with accidents during a single year. 


Number of Accidents     Number of 
per Worker              Workers 

0                       490 
1                       76 
2                       23 
3                       6 
4                       3 
5                       1 
6                       0 
7                       1 


7. What percentage of the workers had 4 or more accidents during the year? 

   (A) 0.10   (B) 0.25   (C) 0.50   (D) 0.83   (E) 5.00 

8. What is the probability that one of the workers picked at random from the 
   group will have had more than one accident during the year? 

   (A) 0.023   (B) 0.034   (C) 0.046   (D) 0.057   (E) 0.200 

UNDERSTANDING OF MODERN SOCIETY 

Sample Directions 

Each incomplete statement in this test is followed by five words, phrases, or 
clauses, one of which will complete the statement correctly. Select the correct 
completion and blacken the space beneath the corresponding number on the 
appropriate line on the answer sheet. 

Sample Questions 

9. The term “world power” refers to a nation 
   (1) whose products are commonly used throughout the world 
   (2) which maintains diplomatic agents in all the recognized sovereign nations 
   (3) which has a large population 
   (4) which has sufficient wealth and organization to exert a strong influence in 
       world politics 
   (5) so strong in a military and an economic sense that it can demand 
       representation at a general international assembly 

10. Japan today presents no immediate threat to peace in the Far East principally 
    because 
   (1) so much of the country has been devastated 
   (2) she has been stripped of her colonies and conquests 
   (3) the present Japanese constitution outlaws war 
   (4) the new Japanese government is much opposed to the military party 
   (5) there is now unity of purpose among the various interests in the Far East 

PREMEDICAL SCIENCE 

Sample Directions 

Each of the questions in this test is followed by five suggested answers. You are to 
select the best answer for each question and mark the corresponding space on the 
answer sheet. 

Sample Questions 

11. When both the pressure and the absolute temperature of a sample of “perfect” 
    gas are doubled, the volume of the gas is multiplied by 
   (1) 1 
   (2) 2 
   (3) [illegible] 
   (4) 8 
   (5) 16 

12. Which one of the following is 75 per cent carbon, by weight, and 25 per cent 
    hydrogen, by weight? 
   (1)-(5) [five chemical formulas, illegible] 

(By permission of the Educational Testing Service.) 

(100 in Reading: Word Meaning) is equivalent to an EA of fifteen 
years, eight months, and to the mean of grade 9.7. Her lowest score 
(76 in dictation, in this case a spelling test) is equivalent to an EA 
of twelve years, and to a grade standing of 6.2. The educational ages 
above fifteen years and below seven years were not based upon actual 
measurement but upon the extrapolation¹ of a growth curve which 
was found to fit the scores for the intermediate ages. 

Educational ages and grade equivalents are valuable for placing or 
promoting a pupil in schools where promotions are based upon 
achievement. They do not, however, furnish an indication of where 
a person is in his own age and grade groups, with whose members 
he must compete. This information is furnished by centile scores 
(Illus. 65A). Here the proportion of pupils that exceed a given pupil 
can be quickly read. A chart which combines both centile and grade 
norms for a single test is shown in Illus. 123. From this table it 
appears that about 13 per cent of students in the grades below college 
were retarded one grade; 10 per cent, two grades; and 8 per cent, 
three grades. Similarly, approximately 14 per cent were advanced one 
grade; 10 per cent, two grades; and 8 per cent, three grades. Figures 
of this sort vary according to the method of grouping students which 
happens to be in effect in a particular school. The spreads of age and 
grade scores in two city schools are contrasted in Illus. 80. Although 
the data refer to intelligence test scores, similar results are found 
when general educational achievement tests are used. The medians 
and the dispersions are found to be nearly the same for similar age 
groups, but in the grade groups, City B has higher means and greater 
dispersions than City A. Pressey (1933) commented on this situation 
as follows: 

The two school systems have, therefore, created a situation educationally 
different out of almost identical intellectual material. Further study showed 
the grade differences to be due primarily to different promotion policies in 
the two towns. School System B retarded its children to such an extent that 
the average chronological age per grade was about five months above 
that for System A. This excessive retardation so discouraged a great many 
of the children that they dropped out of school as soon as possible and 
therefore never reached the upper grades. School System A, on the other 
hand, pursued a liberal promotion policy and tended to promote every 
child every year. As a result, the average age per grade was less than in 
School System B, and the children were so encouraged that more of the 
duller ones continued into the upper grades; even those who dropped out 
of school at the legal age were in a higher grade than the similar class of 
children in System B. These contrasting promotion policies caused System 
B to require a higher standard of ability for entrance to each of the grades 
than did System A. The higher standard in the former may appear desirable 
until one remembers the large number of unhappy, retarded children who 
never continued beyond the middle grades of elementary school. A heavy 
retardation will undoubtedly raise the grade averages, not only in 
intelligence but also in achievement, but at the expense of the best possible 
individual development of the children. 

In making comparisons of communities and schools, one must always be 
alert to the differences that are caused, not by the innate ability of the 
pupils but by the artificial grade grouping resulting from the particular 
promotion policies adopted by school officials. Unless one knows the 
chronological age of children in the same grades in different schools or school 
systems, he is quite unable to make any valid comparisons as to the 
meaning of the test scores. 

¹ Extrapolation is the continuation of a curve into an unknown area by a 
formula computed from the curve in a known area. 

ILLUS. 80 INTELLIGENCE OF SCHOOL CHILDREN IN 
TWO SCHOOL SYSTEMS 

[Figure: test scores (0 to 180) plotted against years of age (8 to 15) and 
against grades (3 to 9), showing the median and the 10th and 90th centiles 
for City A and City B.] 

(By permission of Pressey, 1933, and Harper & Bros.) 

Accomplishment Quotients 

One of the cardinal objectives listed by nearly all authorities is 
the development of effectiveness in learning. Since one’s effectiveness 
depends upon both his native ability and acquired skills, it was 
thought that relative effectiveness would be indicated by comparing 
them. One method of comparison used the ratio of two test scores: 
a general intelligence test and an educational achievement test. 
Franzen (1922) divided the educational age by the mental age and 
secured an accomplishment quotient (AQ). Such quotients have 
practically gone out of use, however, for the following reasons: 

1. It was shown by Rand (1925) that mental ages usually have larger 
dispersions than educational ages for the same group of persons. 

2. Logically, the correlation between AQ's and IQ’s would always be 
zero unless the correlation between IQ and EQ (Educational Quotient) is 
1.00. This correlation is rarely or never found because the two types of 
tests emphasize different skills. 

3. Marked difficulties have been experienced in standardizing both 
educational and mental ages, particularly above the average adult level. 

4. The usefulness of the ratio has been questioned, for the concept of 
general intelligence as a unitary element has been attacked and, in the 
minds of many authorities, replaced by more descriptive items. 

5. Intelligence quotients can rarely be taken as good indicators of native 
abilities. 

Nevertheless, the problem of evaluating relative effectiveness is still 
present and still urgent. It will be solved only by longitudinal studies 
of development under controlled conditions. 
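
The quotient itself is simple arithmetic, as the following sketch suggests. The function name and the sample ages are hypothetical, and the multiplication by 100 (by analogy with the intelligence quotient) is an assumption, since the text states only that the educational age is divided by the mental age.

```python
def accomplishment_quotient(educational_age_months: float,
                            mental_age_months: float) -> float:
    """Educational age divided by mental age.

    The factor of 100 is an assumed convention borrowed from the IQ;
    the text gives only the ratio itself.
    """
    if mental_age_months <= 0:
        raise ValueError("mental age must be positive")
    return 100 * educational_age_months / mental_age_months


# Hypothetical pupil: educational age 12 years 0 months (144 months),
# mental age 13 years 4 months (160 months).
aq = accomplishment_quotient(144, 160)

if __name__ == "__main__":
    print(f"AQ = {aq:.1f}")
```

A quotient below 100 would be read as achievement falling short of measured ability, and one above 100 as achievement exceeding it. The objections listed above apply to that reading, whatever scale is used.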


ILLUS. 81. CHANGES WITH AGE OF CORRELATIONS BETWEEN
ARITHMETIC AND LANGUAGE ACHIEVEMENT TESTS

Author                          N            Average Year       Correlation
                                             or Grade

Thorndike (1926)                126          5th grade          .52
Schiller (1934)                 186 boys     3rd, 4th grades    .63
                                206 girls    3rd, 4th grades    .60
Asch (1936)                     *79 boys     9 yr.              .68
                                 79 boys     12 yr.             .43
                                *82 girls    9 yr.              .63
                                 82 girls    12 yr.             .30
Garrett, Bryan, Perl (1935)     306 boys     9 yr.              .52
  (Vocabulary and Arithmetic)    96 boys     12 yr.             .61
                                102 boys     15 yr.             .37
                                340 girls    9 yr.              .40
                                100 girls    12 yr.             .55
                                123 girls    15 yr.             .55
Buckingham (1937)               105 pupils   9th grade          .38
  (Co-operative Algebra
  and Gates Reading)
Garrett (1928)                  338 men      1st yr. college    .21
Schneck (1929)                  210 men      2nd yr. college    .14

* The same children retested 3 years later with the same tests.

Correlations between English and Arithmetic

Because English and arithmetic have long been considered basic
subjects, their relationships have been studied by a number of authors.
Some of their studies, which are summarized in Illus. 81, show a
marked tendency for lower correlations in older groups. These
correlations are not strictly comparable, because the types of patterns
measured at various ages are only roughly alike. For instance, the
Arithmetic Reasoning Test requires a larger vocabulary and more
complex types of calculation at the twelfth year than at the ninth
year. The skills leading to success at the ninth year may be
inadequate or even detrimental at the twelfth year.

These correlations are also not strictly comparable because it is not
known how much they are affected by a narrow selection of students.
College groups are often selected from the highest third of the
population, a process which reduces correlations considerably below what
they would have been if the total adult population had been sampled.
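
The effect of selection on a correlation can be illustrated by a small simulation. The sketch below is not drawn from any of the studies cited: it assumes normally distributed scores, a population correlation of .60, and selection on the first of the two tests only, all of which are simplifications chosen for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)   # fixed seed so the sketch is repeatable
rho = 0.60               # assumed correlation in the unselected population
pairs = []
for _ in range(20000):
    language = rng.gauss(0.0, 1.0)
    arithmetic = rho * language + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    pairs.append((language, arithmetic))

r_all = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])

# Keep only the highest third on the first test, as a stand-in for college selection.
pairs.sort(key=lambda p: p[0], reverse=True)
top_third = pairs[: len(pairs) // 3]
r_selected = pearson_r([p[0] for p in top_third], [p[1] for p in top_third])

print(f"whole population r = {r_all:.2f}")
print(f"highest third    r = {r_selected:.2f}")
```

Under these assumptions the correlation within the selected third comes out well below the correlation in the whole group, which is the direction of the effect described for college samples.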

APPLICATIONS OF TESTS 

Educational achievement tests have three main uses: individual
diagnoses, predictions of individual success, and evaluations of the
effects of instruction. There is a wide use of tests for diagnoses of
individual differences among normal, retarded, and genius groups,
delinquents, and those with special defects or behavior problems.
Illustrations 62, 65A, 132, and 133 show diagnostic profiles; from such
profiles it is possible to point out a person's strong and weak points
and to suggest remedial plans. This is the work of educational and
clinical psychologists, about which one should consult special texts.

Scholastic Predictions 

Prediction of school success is especially important for selecting
and counseling students. Among the causes of variations in making
predictions, the two major ones are inconsistencies in methods of
assigning school grades and large changes in the interests of pupils.
With these limitations, prediction correlations of .70 are usually as
high as may be expected. A few typical results are:

1. Correlations between achievement-test scores in a particular
subject and class grades in that subject usually range from .42 to .70
(Kohn, 1938; Gates, 1922). Dyer (1918), however, found correlations
from .64 to .94 between College Entrance Board language test scores
and final marks in the corresponding elementary language courses.

2. The prediction of class grades in a particular course from grades
in an earlier course in the same field ranges from approximately
.40 to .70 in the usual school or college group. Greene and Jorgensen
(1936) and Williamson (1937) noted a tendency for the accuracy of
scholastic predictions to decrease in the higher grades.

3. The prediction of algebra and geometry grades from arithmetic
tests is usually in the neighborhood of .50; from special aptitude tests,
.55; and from special aptitude and English comprehension tests
combined, .60. (Ayres, 1934; Richardson, 1935; Orleans, 1934; Baier,
1948; Riegel, 1949.)

4. Prediction of general scholastic average from one year to the
next is usually about .60 in large groups taking different courses.
(Finch and Nemzek, 1934.) When all persons take approximately the
same courses in the same order, correlations approximate .75.

5. Prediction of general scholastic average from a group verbal 
intelligence test is usually near .70 for elementary school groups, but 
considerably lower for high school and college groups. (Crawford 
and Burnham, 1946.) 

Since the actual correlations are too low for much individual use, a 
large number of studies have been made which show that the com- 
bined scores of several tests often yield slightly better predictions 
than a single score. 
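
Why a combination of tests yields only slightly better predictions can be seen from the standard two-predictor formula for the multiple correlation. The sketch below is an illustration under stated assumptions: the criterion correlations of .50 and .55 echo figures quoted in this section, but the intercorrelation of .60 between the two tests is purely hypothetical, and the function names are invented for the example.

```python
import math


def multiple_r(r1, r2, r12):
    """Multiple correlation of a criterion with two predictors.

    r1, r2: correlation of each predictor with the criterion.
    r12:    correlation between the two predictors.
    """
    if abs(r12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r1 ** 2 + r2 ** 2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 ** 2)
    return math.sqrt(r_squared)


def error_fraction(r):
    """Standard error of estimate as a fraction of the criterion's standard deviation."""
    return math.sqrt(1.0 - r ** 2)


# Hypothetical intercorrelation of .60 between the two predictor tests.
combined = multiple_r(0.50, 0.55, 0.60)
print(round(combined, 2), round(error_fraction(combined), 2))
print(round(error_fraction(0.70), 2))
```

Because the predictors overlap with each other, the combined correlation rises only a little above the better single test. Even at .70 the error of estimate remains about seven tenths of the spread of the criterion, which is why such correlations are of limited use for individuals.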

Optimum Age for Instruction 

Typical of studies which aim to determine the most appropriate 
age for teaching a particular skill is that of the Committee of Seven 
of the Northern Illinois Conference on Supervision, reported by 
Washburne (1939). In 1926 this committee began investigations which 
have involved 30,744 children in 255 cities. Their procedure was as 
follows:

1. To define units of arithmetic very precisely and to devise tests to
measure these units. Usually the Compass Arithmetic Test or similar tests
have been used.

2. To determine the usual grade placement of a unit by a rough survey.

3. To secure cooperation of school and teachers in the teaching of a
unit at one grade lower, and at one or two grades higher than the usual
grade. Standard teaching procedures have been carefully described.

4. To administer five tests: a Verbal Intelligence Test, a Pretest in
Arithmetic, a Teaching Test, a Final Test, and a Retention Test, 6 weeks
after the final test. All of these tests except the first were practically
equivalent forms.

5. To discover the relationships between these tests and to suggest
optimal age and grade placements. The Committee has usually felt that a
unit should be taught when three fourths of the children succeed in solving
75 per cent of the items of a retention test.
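
The Committee's rule of thumb in step 5 can be written out as a short procedure. The following is a minimal sketch, not the Committee's own computation: the scores, the 20-item test length, and the function names are hypothetical, and mental-age groups are simply examined from youngest to oldest.

```python
def fraction_retaining(scores, n_items, pass_share=0.75):
    """Share of pupils who answered at least pass_share of the retention items."""
    passed = sum(1 for s in scores if s >= pass_share * n_items)
    return passed / len(scores)


def earliest_ready_group(scores_by_group, n_items, group_share=0.75):
    """Return the first mental-age group (in the given order) meeting the criterion."""
    for label, scores in scores_by_group:
        if fraction_retaining(scores, n_items) >= group_share:
            return label
    return None


# Hypothetical retention-test scores (items correct out of 20), youngest group first.
retention_scores = [
    ("MA 6-7", [9, 12, 15, 16, 10, 14, 11, 17]),
    ("MA 7-8", [15, 16, 13, 18, 14, 12, 19, 15]),
    ("MA 8-9", [17, 18, 15, 20, 19, 16, 14, 18]),
]
placement = earliest_ready_group(retention_scores, n_items=20)
print(placement)
```

With these invented figures the two younger groups fall short of the three-fourths requirement, so the unit would be placed at the oldest of the three levels.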



The results of one such procedure are shown in Illus. 82, where the
mental ages of children are plotted against per cents of success on the
total test. From this figure it appears that 53-per-cent success was
reached by the average child at a mental age of five years, one month;
75 per cent at seven years, two months; and 100 per cent at nine years,
ten months. If a teacher desires any particular degree of mastery, the
unit can be placed accordingly; or, vice versa, if a particular degree
of mastery is found among a given group, the success of the instruction
may be appraised.

ILLUS. 82. GROWTH OF SUCCESS ON HARD ADDITION

[Figure: per cent (0 to 100) plotted against M. A. groups 5-2, 6-2, 7-2, 8-2, 9-2. Curve A: Per Cent of Items Passed by MA Group. Curve B: Per Cent of Group Retaining 75 Per Cent or More Items.]

(After Washburne, 1939, Figure I. By permission of the National Society
for the Study of Education.)

The use of verbal intelligence tests in this connection might be
challenged on the ground that readiness for a particular arithmetic unit
is a special ability which may not depend very much upon the skills
needed for success on the intelligence tests. Much research is still
needed to show the relationship between various predictions of
arithmetical success. The Committee has, however, published fairly detailed
and useful statements of the skills which have been mastered by
particular mental age groups, one of which is given below:

Mental Age 7-8

The addition facts with sums of 10 and under are well learned at this
level, and there is little gain in further postponement. The harder addition
facts and the easy subtraction facts can be successfully learned at this age,
but there is a definite gain in postponing them to the next level. The
desirability of systematic drill in these facts at this level is open to question,
in spite of the fact that it produces satisfactory results. Many persons feel,
and there is some evidence to justify the feeling, that the informal
experiences and activities of mental level 6-7 should be continued and expanded at
this level and that systematic drill of all sorts should be postponed to the
next level.

Simple comparisons of length, height, thickness, width, and the like,
including the recognition that one object is two, three, or four times as
high, wide, long, thick, or deep as another, are well learned. Children can
also readily learn to measure lines in even inches, and, with more difficulty,
to draw lines an even number of inches long. They can learn how many
inches there are in a foot and in two feet.

Children can learn to read the clock on the even hour, to distinguish
between morning and afternoon, to understand the symbols a.m. and p.m.

Vocational Predictions 

One of the most elaborate studies of prediction of vocational suc- 
cess from school success is that of Thorndike (1934) and his associates, 
who reported the relationships between three sets of appraisals: 
school record of the usual sort, tests during the eighth grade, and 
records of work 10 years later.

Two groups of students in New York City were studied. One group 
of 271 boys and 203 girls was selected from schools which served 
families of low economic status. The other group of 826 boys and 
925 girls represented the entire city population fairly well (the most 
retarded pupils were excluded from school). By diligent work, com- 
plete records were obtained for approximately 78 per cent of boys 
and 82 per cent of girls. The students whose records were incom- 
plete because they moved away were not significantly different from 
the others at age fourteen. Students who could not be located or 
who refused to cooperate were slightly inferior to the others at age 
fourteen in school progress, intelligence test scores, and scholarship.

The school records which were secured, together with their retest 
reliabilities, are shown below: 

1. Age in eighth grade, third month (.99) 

2. Progress during school attendance (.92) 

3. Conduct (.99)

4. Scholarship (.96)

5. Attendance (.99)

6. Age at leaving school (.97)

7. Grade at leaving school (.99)

The tests used in the eighth grade and approximate reliabilities
for an age group were:

8. Clerical intelligence, Toops (.85)

9. Clerical activities (.80)

10. Stenquist Assembly Test (.40)

11. IER Assembly Test for girls (.70)

12. Arithmetic problems (above .80)

13. Language tests (above .80)

14. A combination of tests 12 and 13 called intelligence (.85)

15. Average annual earnings at ages 20 to 22 (.90)

16. Average level of jobs (.70)

17. Average liking for jobs (reliabilities not calculated, thought to be
about .70)

18. Per cent of time employed (00)

19. Number of changes of employer (thought to be near .99)

These retest reliabilities of school records and vocational records
are very high. The variations were largely due to clerical errors and
slight shifts in standards. The reliabilities of the tests and estimates
used are considerably lower, although they are as high as is generally
found for material of this sort. The test reliabilities are not high
enough for good individual prediction, but they are high enough to
show group trends.

All of these items were intercorrelated in order to show their
predictive value. The results may be summarized as follows:

1. The age of leaving the eighth grade, scholarship, and intelligence-test
scores predict fairly well the grade which will be reached at later ages.
Indirectly, this finding is of vocational significance because the grade which
can be completed indicated the level of college or professional work that may
be attempted.

2. Among the 223 men and 247 women who did clerical work for at
least nine tenths of the time at ages 20 to 22, it was found that clerical
earnings correlated with the earlier tests of clerical intelligence, .26; with
clerical activities, .22; and with scholarship, intelligence, and mechanical
assembly to smaller degrees. Conduct and attendance showed zero relationships
with clerical earnings. The highest correlation obtainable was approximately
.30 for boys and .40 for girls. The correlations between level of
jobs and interest in work and the school records and tests were all nearly
zero.

3. In the case of the 210 men and 155 women who worked nine tenths
of the time at mechanical work from ages 18 to 20, none of the school records
or tests showed significant correlations with earnings, liking for work, and
interest. This situation was also typical of the 299 men and 76 women
who had combinations of mechanical and clerical work or other varieties.

4. The higher a pupil's score in clerical and intelligence tests and in
scholarship, the more likely was he to do better at clerical than at mechanical
work and vice versa, but the likelihood was not large.

5. Of the eighth grade boys, 20 per cent attended college for at least one
semester, and of the girls, 12 per cent.

6. About 2 per cent in the group became criminals. These were inferior
to the group average at age fourteen in all respects.

7. The frequency of change of employer had slight and probably
insignificant relationships to earnings, level of occupation, and interest.

8. The annual earning in white-collar jobs was greater than in mechanical
jobs among women, but not among men.

9. There was much evidence that employers did not select employees on
the basis of ability alone. Had this been the case, the prediction from eighth
grade records would have been materially higher.

10. The prediction of later success, if the study is carried on further, will
probably be greater because many persons had not yet shown what they
could do vocationally.

This study points to the need of more careful instruments for
appraising abilities at early ages and factors in success at later ages. It
indicates that when more precise measurements are available and
when allowances are made for such disconcerting factors as health
and racial prejudice, prediction of vocational success will be of
considerable accuracy.
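
How far unreliable measures alone could account for the low predictions can be gauged with Spearman's correction for attenuation, a standard formula that is not part of the study itself. The sketch below applies it to figures quoted in this section: reliabilities of .85 for the clerical intelligence test and .90 for annual earnings, and an observed correlation of .26 with clerical earnings. Treating the earnings reliability as applicable to clerical workers alone is an assumption.

```python
import math


def ceiling_correlation(rel_x, rel_y):
    """Largest correlation two measures can show, given their reliabilities."""
    return math.sqrt(rel_x * rel_y)


def corrected_for_attenuation(r_xy, rel_x, rel_y):
    """Spearman's estimate of the correlation between error-free scores."""
    return r_xy / ceiling_correlation(rel_x, rel_y)


ceiling = ceiling_correlation(0.85, 0.90)
corrected = corrected_for_attenuation(0.26, 0.85, 0.90)
print(round(ceiling, 2), round(corrected, 2))
```

On these figures the correction raises the correlation only slightly, which suggests that the unreliability of the measures is a minor part of the story and that other factors, such as those named in this section, carry most of the weight.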


NEEDED RESEARCH 

In this chapter tests have been described, but because of lack of 
space, have seldom been criticized or even analyzed to show their 
fine points. A good deal of criticism, however, has been directed 
from time to time at most of the tests listed here. Perhaps the most
serious is that the tests often emphasize isolated bits of information
which are of little value and are soon forgotten. Instead, they should
emphasize a few important tools and develop problem-solving skills
and attitudes. Thus it has been urged that it is of little importance
to know the name of a hero in a novel, but it is important to know
that he had certain ideals and that he used particular methods to
solve his problems with particular results. Also, it seems of little use
to memorize a formula by rote learning or to acquire a great number
of data which will seldom be used or remembered. The critics
feel that this undesirable emphasis is strengthened by the use of many
of the current achievement tests. This is doubtless true to some
extent. Few analyses of achievement tests have as yet come to hand to
show what specific knowledge, reasoning, or other factors they
measure. A great deal more research is needed to determine more
carefully the goals of specific courses and to prepare proficiency tests in
skills as well as in the usual factual items. It is now possible, however,
to select a test which apparently emphasizes the particular type of
knowledge and skill that is desired. And in spite of their defects,
many present-day achievement tests are much more economical,
reliable, and valid than those that were available a few years ago.


STUDY GUIDE QUESTIONS 

1. Which of the cardinal objectives of education are not specifically
concerned with measures of ability or school achievement?

2. How can composition be accurately appraised?

3. How can adequate word-knowledge tests be developed for a given
field of knowledge?

4. What are the best ways of measuring grammar and punctuation?

5. Discuss the advantages of oral and written spelling tests.

6. How does the Progressive Achievement Test diagnose difficulties?

7. Prepare a paragraph-organization test. What does it measure?

8. What advantages are there in using a teacher's diagnostic chart for
arithmetic difficulties?

9. What are the main components of physical-science tests?

10. What are the main components of social-science tests?

11. How can significant social-science activities be appraised?

12. What elements are found in clerical tests which are not usually
found in scholastic achievement tests?

13. How can the range of abilities in a given grade be compared with
the range in another grade?

14. What may accomplishment quotients be used for?

15. Why are the correlations between language and arithmetic smaller
in the higher grades than in the lower grades?

16. What are the usual predictions from achievement tests to success in a
course of study?

17. Why do group intelligence tests predict course success about as well
as specific achievement tests in elementary grades, but not in high school?

18. How can tests be used to aid in determining the optimum ages for
instruction?

19. What factors have in the past seriously limited the possibility of
predicting vocational success very accurately?



CHAPTER VIII 


GROUP TESTS OF ABILITY 




Applications of analytical methods to the measurement of abilities
are discussed in this chapter. Also, current analytical batteries are
described and compared for content, factorial purity, and practical
use.

Primary abilities are not defined as innate traits, but as traits which
are primary in the sense that they are (a) statistically independent of
each other, (b) psychologically basic to many types of academic and
vocational success, and (c) stable over fairly long periods of time and
not influenced greatly by practice or by recent formal training.

EARLY TESTS 

During the period when individual tests of the Binet type were
being developed there was also an active growth in the design of
mental tests for use with groups. All types of tests now widely used
seem to have been fairly well developed in form before 1910. For
instance, careful methods for measuring memory span were described
by Jacobs (1887) and refined by Ebert and Meumann (1905). In 1889
Cattell and Bryant tried out a number of tests of both controlled and
uncontrolled association, which were later developed by Jastrow
(1891). The use of standard arithmetic tests in the study of association
processes was begun in 1895 by Oehrn and also by Kraepelin. During
1897 Ebbinghaus published an enthusiastic account of a
sentence-completion test as a real test of intelligence. In 1891 Kirkpatrick
described a rather difficult vocabulary test.

In 1903 Swift published his work on interpretation of fables or
proverbs. Whipple (1908) published fairly elaborate vocabulary and
anagram tests and in 1909 a range-of-information test. Cyril Burt
reported elaborate experimental tests of higher mental processes in
1909. In the United States Woodworth and Wells (1911) reported
their famous association tests, which included verbal analogies,
opposites, part-whole, agent-action, species-genus, and hard directions.
In 1913 Pyle reported an elaborate set of examinations of school
children that resulted in age norms.

In 1914 Whipple published a 2-volume manual of mental and
physical tests which included fifty-one tests with directions and norms,
and about five hundred references to technical reports. Since that
time the production of technical reports and test revisions has been
voluminous, and many refinements have been made in administration,
scoring, and scaling procedures.

Most of the early examiners followed the hypothesis that persons
are possessed of a general faculty called intelligence, which can be
measured by a variety of mental tests. They usually wished to
appraise intelligence in order to make practical predictions of some
sort. For an intelligence test they wished to select only those items
which showed fairly high correlations with some criterion of
intelligence and low correlations with one another. An important
application of this method of selection was the development of the United
States Army mental tests in 1917. The more recent developments of
military tests are discussed in Chapter XI.

Approximately one and three-fourths million soldiers were
tested during 1917 and 1918 by one or both of the forms developed
at that time. These tests, which have since been widely applied to
industrial, prison, and school groups, have also been widely copied in
both form and idea. As criteria for defining intelligence, the
psychologists in charge of the work of testing the soldiers decided to use
combined scores of (a) formal school accomplishment, (b) scores on
the Stanford-Binet test, and (c) ratings by officers. Groups of soldiers
for whom these criteria were available were given batteries of thirteen
preliminary verbal tests. Four of the preliminary tests were taken
from the work of Otis, who generously placed them at the disposal
of the committee. Several tests show great similarities to those in
the Woodworth-Wells series. Binet's and Thurstone's tests were also
used or adapted in making up these Army tests.

The correlations of total preliminary test scores with officers'
ratings of intelligence ranged from approximately .50 to .70; with
Stanford-Binet Test score, .80 to .90; with Trabue Language
Completion Scales, .72; with schooling, .76; and with the Beta Test, .80. The
lowest correlations of the separate subtests with total weighted scores
(Yerkes, 1921, p. 541) were approximately .65 for tests of oral
directions, memory span, disarranged sentences, and practical judgment.
The highest correlations were approximately .85 for tests of
arithmetic problems, verbal opposites, information, verbal analogies, and
number comparisons. The mean of the correlations between subtests
for a sample of 895 soldiers was approximately .61, and the subtests
which correlated most highly with the total scores also showed the
highest correlations with the other subtests. These results led
empirically to the conclusion, which is also mathematically obtainable,
that it is impossible to secure subtests which will correlate highly
with a criterion and nearly zero with each other. Subtests which had
degrees of correlation with each other and with total scores similar
to those shown above were finally selected and combined into a test
called the United States Army Alpha Test. In constructing the test
forms, several practical factors were considered. The test should:

1. Be adapted for use with large groups of persons who had wide
differences in ability. (The final test items ranged from those which were
answered by nearly 99 per cent of a group of adults to those answered
correctly by about 1 per cent.)

2. Have a number of equivalent forms to prevent cheating, or coaching,
or marked practice effects. (Five forms of the Alpha tests which were found
to give very similar results were prepared.)

3. Be arranged for ease and accuracy of scoring by clerical workers. (This
called for a minimum of writing. The answers were usually a single number
or a check mark indicating a particular choice.)

4. Render clues of malingering during an examination. (The tests did
not succeed in giving accurate answers to this problem.)

5. Be interesting.

6. Be short. (The total working time of the Alpha test was limited to 24
minutes.)
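
The remark that the conclusion about subtests and a criterion is "mathematically obtainable" can be made concrete. For two tests and one criterion, the three correlations must form a consistent set, which limits how low the correlation between the tests can be. The sketch below uses that standard result; the function names are invented, and applying it to the figure of .85 (which the text reports for correlations with total scores) is for illustration only.

```python
import math


def intercorrelation_bounds(r1c, r2c):
    """Range of possible correlations between two tests, given each one's
    correlation with the same criterion."""
    spread = math.sqrt((1.0 - r1c ** 2) * (1.0 - r2c ** 2))
    return r1c * r2c - spread, r1c * r2c + spread


def max_uncorrelated_tests(r_criterion):
    """Most mutually uncorrelated tests that can each correlate r with one criterion.

    For uncorrelated tests the squared criterion correlations sum to at most 1.
    """
    return math.floor(1.0 / r_criterion ** 2 + 1e-9)


low, high = intercorrelation_bounds(0.85, 0.85)
print(round(low, 3), round(high, 3))
print(max_uncorrelated_tests(0.85), max_uncorrelated_tests(0.50))
```

Two tests that each correlate .85 with a criterion must therefore correlate at least about .45 with each other, and tests that are truly independent of one another cannot all correlate highly with the same criterion.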

The Alpha and Beta Tests 

The Alpha Test is a paper-and-pencil battery with eight subtests, 
each placed on a separate page of a booklet and allotted a special 
time limit. The first is a test of span of auditory attention. The
examiner reads directions which the soldiers follow by making lines
or numbers on the prepared items. The directions become more
complicated toward the end of the test. Thus, the directions for
the second item are:

Attention! Look at 2 where the circles have numbers in them. When I
say “Go” draw a line from circle 1 to circle 4 that will pass above circle 2
and below circle 3. GO! (Allow not over 5 seconds.)

and for the eleventh item:

Attention! Look at 11. When I say “Go” draw a line through every even
number that is not in a square, and also through every odd number that is
in a square with a letter. GO! (Allow not over 25 seconds.)

The second test is a 5-minute test of twenty arithmetic problems.
The third test involves common sense or practical judgment, 1½
minutes being allowed for the sixteen items. The fourth test allows
1½ minutes to check forty pairs of words to show whether they are
the same or opposite. The fifth test allows 2 minutes to rearrange
twenty-four sentences that had been disarranged in a random fashion.
The sixth test requires the completion of twenty number series,
allowing 3 minutes. The seventh test consists of forty verbal analogies,
with a working time of 3 minutes. The eighth test allows 4 minutes
to check forty multiple-choice items of miscellaneous information.
A short time is allowed to consider samples of the next test.

In order to provide a test for men who could not read English, a 
selection and standardizing procedure similar to that of the Alpha 
Test was followed, beginning with fifteen nonverbal tests. The re- 
sult is called the United States Army Beta Test. 

This is also a paper-and-pencil test with seven subtests. It was
designed so that it could be demonstrated largely by pantomime and
without many words. Before each test the examiner and a
demonstrator showed on a large blackboard how the work was to be done.

The first test consists of five mazes, of which the first two were
traced on the blackboard by the demonstrator. Then, when the
soldiers understood what was wanted, they were told, “All right,
go ahead. Do it. Hurry up.” The idea of working fast was impressed
on those who were working slowly, because only 2 minutes were
allowed. The second test requires that sixteen pictures of piles of
cubes be viewed, and the number of cubes in each be written down,
2½ minutes. The third test is a nonverbal series completion in which
the pattern of x's and o's is to be completed in each line according
to the way it is printed at the start of the line, with a time allowance
of 1¾ minutes. In the fourth test the examinee is required to
associate symbols with numbers on a sheet according to a code placed
at the top of the page. Two minutes are allowed. The fifth test
allows 3 minutes for comparing pairs of numbers and marking with
an X those pairs that are different. The sixth test allows 3 minutes
for drawing in missing parts of twenty printed pictures. In the seventh
test ten rather easy paper form-board problems are allowed 2½
minutes. The task is to draw lines to show how the small pieces
would fit into the large figure.

CURRENT GENERAL ABILITY TESTS 

The period since 1918 has witnessed the production of many group 
tests of mental abilities, some ostensibly to appraise intelligence
and others to evaluate observing, reasoning, and learning in special
situations. Most of these tests extended the scope of the Army tests 
by adapting them for use in the lower grades. Tests by Otis and 
Pressey appeared in 1918. Haggerty (1920) and Whipple (1919) pub- 
lished standardized tests. A group of affiliated psychologists produced 
the National Intelligence Test in 1920, and Terman published a 
group test in 1926. 

Group tests are usually presented in printed form. Oral group 
tests that consist of verbal and number multiple-choice items and 
problems were standardized by Stump (1935) and by Langmuir 
(1946). The advantages of these tests are that they (a) allow all
students the same amount of time for each item, (b) allow each
student to attempt every item, and (c) obviate the expense of printed
forms. Two alternative sets of questions have been prepared for 
grades 4 to 8 by Stump, and for a wide range of adults by Langmuir. 

Snedden (1927) experimented with a vocabulary test that was
disguised in a questionnaire on traits that may have some
hereditary significance. The subject was asked which of his two
parents possessed the greater amount of a certain trait: being gentle,
meticulous, sanguine, etc. Seventy-five words of graded difficulty were
selected and standardized on several hundred persons. Correlations
with Stanford-Binet MA's were found to be in the neighborhood of
.70. The interview form was further developed by Maizlish (1936),
who issued a like-dislike questionnaire, and asked the subject to give
reasons for his answers. The reasons showed whether or not the word
was understood. This test, when given to individuals, was found to
correlate .77 with Kuhlmann-Anderson tests; when given to a group,
the correlation dropped to .60.

A popular variety of intelligence test for adults, known as the spiral 
omnibus type, was designed to eliminate the need for accurate 
timing of short periods. A test was desired which would give about 
the same total scores as the United States Army Alpha Test during 
a 20- or 30-minute period of continuous work. Hence, materials much
like those of the Alpha were mixed together by rotating or spiraling
among the tests (Illus. 83).
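
The rotation described for the spiral omnibus type can be pictured as taking one item from each kind of material in turn, with each pool already ordered from easy to hard. The sketch below is only one plausible reading of "rotating or spiraling"; the item labels, pool names, and function name are hypothetical.

```python
from itertools import chain, zip_longest


def spiral_order(subtests):
    """Interleave items by taking one from each subtest in turn.

    subtests maps a subtest name to its items, already sorted from easy to hard.
    """
    missing = object()  # sentinel marking exhausted pools
    rounds = zip_longest(*subtests.values(), fillvalue=missing)
    return [item for item in chain.from_iterable(rounds) if item is not missing]


# Hypothetical item labels standing in for Alpha-like material.
pools = {
    "arithmetic": ["A1", "A2", "A3"],
    "opposites": ["O1", "O2", "O3"],
    "analogies": ["V1", "V2"],
}
booklet = spiral_order(pools)
print(booklet)
```

Arranged this way, difficulty rises steadily across the whole booklet, so a single overall time limit can replace the separately timed short periods.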

A list of most of the tests now available is given in Appendix II. 
Many of these tests are short (usually from 15 to 30 minutes are re- 
quired for taking them), but a few are extensive and require from 
1 hour to 2 hours. They are heavily loaded with language or language 
analogies, but also include arithmetic and occasionally a spatial item. 
They all yield a single score, which is usually converted into MA, 
IQ, and centile. 

ILLUS. 83. O'ROURKE GENERAL CLASSIFICATION TEST, SENIOR GRADE 
Grade of difficulty: High-school senior, College freshman 
Directions and Samples 

The directions and samples on this sheet are to show you the kinds of items you 
will find in the test. Study them carefully so you will know how you are to answer 
each kind of item. There will be no directions nor explanations in the test. 
You are to write your answers always on the line at the right of the item. 
You will be allowed 10 minutes to study the samples on the front and back of 
this page and answer those not answered. 

                                                              WRITE ANSWERS HERE 

Meaning. If the word in capital letters fits into the meaning of the 
sentence, write "correct" on the line at the right. If the meaning is 
not correct, write the number of the word which does fit. 
Example: The clear day was DETAILED for the picnic. (1) ORDERED 
(2) INVENTED (3) IDEAL (4) EXCUSED (5) DISCOVERED . . . 3 

"DETAILED" does not fit the meaning of the sentence, but 
"IDEAL," which is marked "3," does; so "3" is written on the line 
at the right. 

Relations. The first two words in each set are related in some way. 
Write a word which is related to the third word in the same way as 
the second is to the first. The word you write must begin with the 
letter before the answer line. 

Example: SHOE is to FOOT as HAT is to . . . H Head 

"Head" is written on the line at the right, because it begins with 
"H" and a hat is worn on the head just as a shoe is worn on the foot. 

Information. Five ways of completing the statement are suggested. 
Write the number of the one which makes the true statement. 
Example: Water is heavier than (1) paint (2) granite (3) iron 
(4) wood (5) sand . . . 4 

Spelling. If every word in the sentence is correctly spelled, write 
"correct" on the line at the right. If you find an incorrect word, 
spell that word correctly on the line at the right. 

Example: His education was a great advantage [misspelled in the test item] . . . advantage 

Opposite. Write the number of the word which means the opposite of 
the word in capital letters. 

Example: The opposite of LIGHT is (1) limited (2) small 
(3) bright (4) dark (5) damp . . . 4 

"Dark," marked "4," means the opposite of "LIGHT," so "4" is 
written on the line at the right. 

Grammar. If the sentence is grammatically correct, write "correct" 
on the line at the right. If you find the sentence incorrect, write 
what would with the least changes express the meaning correctly. 
Example: The books is sold at the store . . . are 

(By permission of L. J. O'Rourke, 1935 edition, and The Psychological Institute, 
3506 Patterson St., N.W., Washington, D.C.) 

At the college level Thurstone (1919) inaugurated a series of 
mental tests for college entrance under the auspices of the American 
Council on Education. Annual editions have been given to thousands 
of high school graduates. This series consists of tests of such types as 
vocabulary, mathematics, verbal analogies, and learning an artificial 
language, which are administered in periods so short that few if any 
students complete the work. Similarly, Thorndike began a series of 
intelligence tests for college entrance (1920). This series includes tests 
of word meaning, mathematics, and special information, for which 
230 minutes are allowed. A similar excellent series called the Ohio 
State University Psychological Test has been designed by Toops 
(1937), and another series by the College Entrance Examination 
Board. 

In order to reduce the effects of training on test scores, R. B. Cattell 
(1940) developed his Culture-Free Test for a wide range of adult 
ability. 

The Culture-Free Test is a paper-and-pencil test consisting of seven 
parts. It is unique in that it is a power test in a field where speed 
is usually stressed, and it involves two or three variables in making 
deductions. The Psychological Corporation applied an experimental 
form to two groups, each of approximately one hundred boys. One 
group was composed of vocational high school boys, the other of 
academic high school boys. Items showing significant differences 
between the two groups were retained. Similar item-validation studies 
were made among groups of college students, seventh- and eighth-grade 
pupils, and two hundred psychology majors. As a result Cattell's 
maze tests were dropped and the other tests revised. The 1945 edition 
consists of: 

1. Classifications. Fifteen items, each accompanied by six little pictures 
or diagrams. The person being tested is to find and mark two in each row 
which do not belong with the others. This is a form-analogies reasoning 
test where size, direction, shape, and shading are varied. (10 min.) 

2. Pool reflections. Nine rows of six small pictures are to be inspected 
to find which one of each six is the exact mirrored drawing of a key picture 
above the row. (10 min.) 


SAMPLE (figure not reproduced: a row of lettered answer pictures) 

3. Series completion. Fifteen items, each consisting of three small pictures 
at the left and six on the right. The task is to decide, from looking at the 
left series, which should come next, then to select it from the six pictures 
at the right. (20 min.) 

4. Matrices 4-item relational. Eleven items, each consisting of a group of 
three little pictures and a blank arranged in a square on the left and 
six other pictures on the right. The task is to select from the six the picture 
which will complete the square on the left. This is a completion test in 
which the pattern must be determined by making comparisons in both 
vertical and horizontal directions. (? min.) 

5. Matrices 9-item relational. Each of the eleven items consists of eight 
small pictures and a blank arranged in a square. Six other pictures to choose 
from are below the eight. The task is to make the pattern in the square 
look finished, balanced, and complete. (7 min.) 

SAMPLE (figure not reproduced: a matrix with lettered answer pictures D, E, F) 

6. Matrices 9-item cyclical. Each of eleven items consists of spaces for 
nine small pictures. The lower right-hand space is always a blank to be 
filled in from six other small pictures below. This test is more difficult 
than the fifth test, because instead of having eight small pictures in the 
matrix, there are now only three or four. The rest have been torn away, 
and must be supplied by inferences between the matrix and the possible 
choices. (20 min.) 

Although the tests have time limits, they are long enough to allow 
the pupil to finish as much as he can. The items are steeply graded. 
All of the tests primarily require reasoning with 2-dimension pic- 
tures, using differences of space, size, shading, and direction. Per- 
ceptual speed is not important because of the generous time limits. 
The fact that this is a measure of general mental ability is stressed. 
It may more descriptively be called a complex deduction test using 
static pictures. Motion and time sequences are not involved. 
Additional evidence is needed to show the effects of different cultures 
on this test. The split-half reliability of the whole test was .88 when 
it was given to 121 high school pupils. All of the parts correlate 
from .50 to .80 with the total score, with Test 3 showing the highest 
correlation. The correlation of the test with Army Alpha was found 
to be about .50, and with the Minnesota Paper Form Board about 
.60. Fairly adequate norms are now available for high school 
freshmen and seniors. 
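
The split-half figure quoted above can be illustrated with a minimal sketch. It assumes an odd-even split of the items and the Spearman-Brown step-up to full test length; the text does not say how the halves were formed for this test, and the function names and sample data are illustrative only.

```python
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists.

    Raises ZeroDivisionError if either list has no variance.
    """
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Estimate full-length reliability from the correlation of two halves."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list of 0/1 item scores per person (assumed layout).

    Items 1, 3, 5, ... form one half and items 2, 4, 6, ... the other.
    """
    odd_half = [sum(person[0::2]) for person in item_scores]
    even_half = [sum(person[1::2]) for person in item_scores]
    return spearman_brown(pearson_r(odd_half, even_half))
```

With steeply graded items, as in this test, an odd-even split keeps the two halves roughly matched in difficulty, which is why it is the usual choice for such a sketch.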

Two-Factor Tests 

Better predictions have often resulted from two separate tests (one 
for language and the other for number) than from one test. Among 
the several tests now available which provide two separate scores 
are: American Council on Education (ACE) Psychological Examina- 
tion, California Tests of Mental Maturity, College Entrance Board 
Scholastic Aptitude Tests, and Shipley Institute Conceptual Quotient, 
Hartford Retreat, Conn. 

The American Council on Education Psychological Examination 
is published each year in a new but equivalent form prepared by 
L. L. Thurstone and Thelma G. Thurstone, and is used by more 
than six hundred colleges and universities and many counseling 
centers. The test on the college level requires an hour for 
administration and consists of six sections, each of which is preceded by 
practice problems, such as: 

1. Arithmetic. (Some of these problems require the use of decimals and 
fractions.) 

In this test you will be given some problems in arithmetic. After each problem there are five answers, 
but only one of them is the correct answer. You are to solve each problem and blacken the space on the 
sheet which corresponds to the answer you think is correct. The following problem is an example. 

1. How many pencils can you buy for 50 cents at the rate of 2 for 5 cents? 
(a) 10   (b) 20   (c) 25   (d) 100   (e) 125 

Find on the answer sheet the space labeled "ARITHMETIC, Practice Problems, Page 3." The correct 
answer to the problem is 20, which is answer (b). 

In the row numbered 1, space (b) has been blackened. 

2. Completion 

Look at the following definition. You are to think of the word that fits the definition. 

1. A contest of speed.     B   F   M   P   R 

3. Figure analogies 

Look at the figures A, B, and C in Sample 1 below. Figure A is a large circle. Figure B is a small 
circle. By what rule is Figure A changed to make Figure B? The rule is "making it smaller." Now look at 
Figure C. It is a large square. What will it be if you change it by the same rule? It will be a small square of 
the same color as the large square. Figure 2 is a small white square. In the section of the answer sheet labeled 
"FIGURE ANALOGIES, Practice Problems, Page 7," the space numbered 2 in the first row has been 
blackened to indicate the correct answer. 

A   B   C        1   2   3   4   5 

(Figures not reproduced: A, a large circle; B, a small circle; C, a large square; then five answer figures.) 


4. Same-opposite 

The word at the left in the following line is "many." 

1. many     (1) ill   (2) few   (3) down   (4) sour 

One of the four words at the right means either the same as or the opposite of "many." The word "few," 
which is numbered 2, is the opposite of "many." In the section of the answer sheet labeled 
"SAME-OPPOSITE, Practice Problems, Page 9," space number 2 in the first row has been blackened. 

5. Number series 

The numbers in each series proceed according to some rule. For each series you are to find the next 
number. 

In the first series below, each number is 2 larger than the preceding number. The next number in the 
series would be 14. Of the five answers at the right, answer (e) is, therefore, correct. In the section of the 
answer sheet labeled "NUMBER SERIES, Practice Problems, Page 11," space (e) in the first row has been 
blackened. 

     Series                        Next Number 
1.   2   4   6   8   10   12       10   11   12   13   14 
                                   (a)  (b)  (c)  (d)  (e) 

6. Verbal analogies 

Read the following words: 

1. foot-shoe     hand-     (1) thumb   (2) head   (3) glove   (4) finger 

The first two words, foot-shoe, are related. The next word is hand. It can be combined with one of 
the remaining words in the row so as to make a similar pair, hand-glove. In the section of the answer sheet 
labeled "VERBAL ANALOGIES, Practice Problems, Page 13," space number 3 in the first row has been 
blackened. 

The numbers of right answers in Sections 1, 3, and 5 are combined 
to give a Q score (quantitative), and Sections 2, 4, and 6 yield an 
L score (language). A total score is also given. All scores for large 
groups are changed to centiles. 
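
The scoring rule just described can be sketched in a few lines. This is an illustration under stated assumptions, not the publisher's procedure: the total is taken here as Q plus L (the text says only that a total is given), and the centile is computed with a common mid-rank convention. The function names and the dictionary layout are hypothetical.

```python
def ace_scores(right_answers):
    """right_answers: dict mapping section number (1 to 6) to number right.

    Sections 1, 3, 5 feed the Q score; sections 2, 4, 6 feed the L score.
    The total is assumed to be Q + L.
    """
    q = sum(right_answers[s] for s in (1, 3, 5))
    l = sum(right_answers[s] for s in (2, 4, 6))
    return {"Q": q, "L": l, "total": q + l}


def centile_rank(score, norm_scores):
    """Per cent of the norm group below the score, counting ties as half.

    The tie-handling rule is an assumption made for this sketch.
    """
    below = sum(1 for s in norm_scores if s < score)
    ties = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * ties) / len(norm_scores)
```

A student's Q, L, and total would each be looked up against the norm group for that score, so three separate centiles result.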

Shipley (1946) issued a short but effective scale in two parts for 
measuring intellectual ability, which also yields an index of 
intellectual impairment. The first part is a steeply graded literary 
vocabulary test. Ten minutes are allowed for the forty items, all of which 
are of the 4-choice variety. The second part is an abstraction test. 
Ten minutes are allowed for the twenty items, each of which 
requires a series completion. Numbers or letters, or both, are used. 
Both parts are probably little affected by the time limits, for the 
items use progressively rarer words or more difficult problems. 

Reliability coefficients for 322 Army recruits were .87 for 
vocabulary, .89 for abstraction, and .90 for the total. Raw scores may be 
changed into vocabulary age, abstraction age, and mental age, 
ranging from about eight to twenty years. 

The index of impairment, called the Conceptual Quotient (CQ), 
is based on the clinical experience that in mild degrees of mental 
impairment vocabulary is relatively unaffected, while capacity for 
conceptual thinking or abstraction declines according to the degree 
of impairment. A table which allows one to read the CQ from the 
raw scores is provided. Approximately the same results may be 
obtained by dividing abstraction age by vocabulary age. All CQ's 
below 100 are in the direction of impairment and are interpreted as 
follows: 

CQ           Classification            Per Cent of Normal 

above 90     normal                    73 
85-90        slightly suspicious       10 
80-84        moderately suspicious      7 
75-79        quite suspicious           5 
70-74        very suspicious            3 
below 70     probably pathological      2 


Shipley points out that for those with vocabulary scores above 32 or 
below 23 these conceptual quotients are not useful, because the usual 
relation between vocabulary and abstraction skills does not hold. 
Also, CQ’s above 90 are commonly found with psychoneurotics and 
early psychotics. Chronic psychotics, however, almost always show 
losses in abstract thinking. 
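
The approximate rule and the classification table above can be put together in a short sketch. Two points are assumptions rather than statements from the text: the ratio is multiplied by 100 so that it falls on the scale the table uses, and fractional values between the printed integer bands are assigned to the lower band. The function names are illustrative, and the published conversion table, not this ratio, is the authoritative source of the CQ.

```python
def conceptual_quotient(abstraction_age, vocabulary_age):
    """Approximate CQ: abstraction age over vocabulary age, times 100.

    The factor of 100 is an assumption; the text gives only the ratio.
    """
    return 100.0 * abstraction_age / vocabulary_age


def classify_cq(cq):
    """Return the table's classification for a CQ value."""
    if cq > 90:
        return "normal"
    if cq >= 85:
        return "slightly suspicious"
    if cq >= 80:
        return "moderately suspicious"
    if cq >= 75:
        return "quite suspicious"
    if cq >= 70:
        return "very suspicious"
    return "probably pathological"


def cq_is_interpretable(vocabulary_raw_score):
    """Shipley's caution: the CQ is not useful when the vocabulary
    raw score is above 32 or below 23."""
    return 23 <= vocabulary_raw_score <= 32
```

In use, the vocabulary raw score would be checked first, since outside the 23 to 32 range the classification should not be read at all.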

THE DEVELOPMENT OF ANALYTICAL TESTS 

A few years ago many tests of ability were produced and widely 
distributed with the assurance of the authors that they had high re- 
liability, and would predict a certain type of success to a moderate 
degree. Then it became apparent that neither reliability nor validity 
would indicate what was being measured, particularly when, as is 
usually the case, the criterion of success was complex and roughly 
appraised. 

Later, multiple-correlation techniques were introduced to improve 
predictions of success. These techniques weight the tests in a battery 
according to the degree to which they correlate with the criteria. 
While they yield somewhat better predictions, these techniques also 
result in the selection of tests which measure unknown amounts of 
unknown factors, and which seem to overlap each other in content. 
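
For the simplest case of a battery of two tests, the weighting described above can be written out explicitly. The formulas below are the standard two-predictor results from regression theory and are offered only as an illustration; the text itself gives no formulas. The symbols are notation introduced here: $r_{y1}$ and $r_{y2}$ are the correlations of each test with the criterion, $r_{12}$ is the correlation between the two tests, $\beta_1$ and $\beta_2$ are the standard-score weights, and $R$ is the multiple correlation.

```latex
\beta_1 = \frac{r_{y1} - r_{y2}\, r_{12}}{1 - r_{12}^{2}}, \qquad
\beta_2 = \frac{r_{y2} - r_{y1}\, r_{12}}{1 - r_{12}^{2}}, \qquad
R^{2} = \beta_1 r_{y1} + \beta_2 r_{y2}
      = \frac{r_{y1}^{2} + r_{y2}^{2} - 2\, r_{y1}\, r_{y2}\, r_{12}}{1 - r_{12}^{2}}
```

When the two tests overlap heavily, so that $r_{12}$ is large, the denominator becomes small and the weights become unstable, which is one way of seeing why such batteries are hard to interpret.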

Since 1940 a new goal has been emphasized by certain authors of 
tests. The factorial purity of a test or test item has become a major 
consideration. Homogeneous or pure tests are defined as those whose 
variance is due to only one factor. They have the great advantage 
of yielding a definite interpretation which is relatively stable for 
the populations used. Pure tests also allow more economical 
measurement than an equal or greater number of less pure tests. 
Furthermore, they allow a profile of independent traits, which is 
more revealing than a single score. 

Factorial analyses, when properly used, will provide evidence for 
the construction of unique tests. The advantages and limitations are 
given in Chapter XIV. 

DESCRIPTION OF BATTERIES OF APTITUDE TESTS 

In recent years the results of testing have led to general recogni- 
tion of about ten large groups of factors, each of which contains from 
two to six fairly independent kinds of subfactors, which are sometimes 
called unitary or primary abilities, because they seem to exist in- 
dependently in the populations studied. None of the authors of 
analytical tests claims that he has developed a pure measure of an 
ability, but much progress toward that end has been made. 

The four batteries which will be reviewed in the following pages 
are Chicago Tests of Primary Abilities, Guilford-Zimmerman Apti- 
tude Survey, Differential Aptitude Test, and General Aptitude Test 
Battery. These batteries were chosen because they represent 
considerable careful research and are available. The significant batteries 
used recently by military authorities are described in Chapter XI. 

The Chicago Tests of Primary Abilities (1943, 1946, 1948) 

Thurstone and Thurstone (1943) issued the Chicago Tests of Pri- 
mary Abilities, a group of eleven tests which measure six abilities, 
after their 20 years of extremely important pioneer work in 
developing and applying methods for analysis of abilities. The 1943 battery 
was prepared after giving preliminary tests to Chicago school 
children at each grade level from the fifth to the twelfth, a total of 
approximately twenty-five thousand pupils. The First-Grade Battery 
(1946) was designed for five- and six-year-olds, and the Elementary 
Battery (1948) for ages seven to eleven. The authors believe that 
the batteries furnish profiles of abilities which indicate weakness 
and strength in academically and vocationally important fields. In 
selecting the tests they considered: 

1. The factorial saturation or purity of the test. Only one primary 
factor is conspicuously present in each test. 

2. Stability of factorial saturation at age levels from eleven to 
seventeen years. Tests which, in this respect, tended to fluctuate 
considerably over the years were not retained. 

3. Clear psychological interpretation. 

4. Availability of parallel forms. Two forms are available. 

5. Ease of administration. 

6. Ease in scoring, either by hand or by machine. 

The high school battery is a single booklet in which each ability, 
except memory, is measured by two tests. Before each test elaborate 
explanation and practice periods are given so that, while the total 
working time is 58 minutes, about 2 hours are needed for test ad- 
ministration. The tests are briefly described here. 

N Number: The first test allows 6 minutes for seventy simple problems 
in addition. Each consists of four 2-place numbers which have been "added." 
The second test allows 5 minutes for 70 simple multiplication items, in each 
of which a 2-place number has been "multiplied" by a single digit. In both 
tests one is to determine whether the right answer has been given, then to 
mark a space to indicate right or wrong. 

V Verbal Meaning: The first test provides 4 minutes for a 50-item 
vocabulary test, largely of literary terms. One of four words which has the 
same meaning as a key word must be underlined. The second test allows 
6 minutes for forty-five completion items. In each item the one being tested 
reads a short definition, then marks the one of five letters which is the 
initial letter in the word defined, thus: 

“The first meal of the day.” 

The defined word is breakfast, so the space after B should be marked. 

S Spatial Thinking: The first test allows 5 minutes for twenty items, in 
each of which the first figure in a row is to be compared with six other figures. 
The person being tested is to mark those which would be identical with 
the first figure if they were appropriately rotated and turned over. The 
second test gives 5 minutes to twenty items, in each of which the first picture of 
a card is to be compared with six other pictures in a row. The testee is to 
mark each card which, if it were slid along the table and rotated, would 
fit the first card. Example: 

Here are more cards. Some of the cards are marked. The cards which are like the first 
card in this row are marked. 

(Figure not reproduced: a row of cards lettered A to F.) 




W Word Fluency: The first test allows 5 minutes to write as many words 
that begin with a given letter as possible. The second test provides 4 
minutes to write as many words as possible that have four letters and which 
begin with another given letter. 

R Reasoning: The first test allows 6 minutes for thirty letter-series items, 
in which the person tested is asked to select from five choices the letter 
which would come next. The second test allows 4 minutes for thirty 
letter-grouping items. Here the testee is to detect and mark in each item one 
group of four letters that does not belong with the other groups, thus: 

AABC   ACAD   ACFH   AAGG 

Three of the groups have two A's. 

M Memory: Twenty cards with first and last names are exposed 15 
seconds each, one right after the other. Then 8 minutes are allowed for choosing 
and marking the right first name for the twenty last names. The choice 
must be made from among the seven first names for each last name. 

ILLUS. 84. INTERCORRELATIONS OF COMPOSITES 

(Only part of this table is legible; the recoverable entries are given by row.) 

V    .40   .54 
S    .28   .17   .16 
M    .31   .36   .35 
R    .53   .49   ...   .29 

                              Primary Abilities 
                       N      W      V      S      M      R 
Composite Score N     .90    .44    .39    .33    .21    .57 
Composite Score W     .43    .91    .54    .20    .39    .47 
Composite Score V     .41    .52    .97    .19    .38    .58 
Composite Score S     .22    .15    .15    .92    .13    .34 
Composite Score M     .31    .37    .36    .14    .79    .11 
Composite Score R     .52    .51    .57    .34    .38    .90 

(By permission of L. L. Thurstone, T. G. Thurstone, and Science 
Research Associates) 

Thurstone has furnished split-half reliability figures for large 
samples of pupils from each half of grades six, eight, ten, and twelve. 
When two tests are combined for each factor, the reliability 
coefficients are all at or above .96 except for memory, which rises from a 
lower value in grade six to .82 in grade twelve. The reliability of Word 
Fluency is not given, since it does not lend itself to the split-half 
method. Reliabilities of individual tests are somewhat smaller, but 
still satisfactorily high. 

Two factorial analyses are reported. First, the raw scores were 
correlated and their matrix (Illus. 84) resolved. Six primary factors 
were found, and their estimated intercorrelations formed a new 
matrix. The second factorial analysis was made of this matrix. 

From the first analysis the loadings of each factor in each 
composite score were found. Each composite was found to have a high 
loading of one factor, and relatively low weights of the other factors 
(Illus. 85), which shows that each of these tests is a fairly pure measure 
of one factor only. The Spatial Thinking Tests are apparently the 
purest. While the Reasoning Tests have a loading with the reasoning 
primary of .90, they are the least pure, because they also have loadings 
above .50 with Number, Word Fluency, and Verbal Meaning. 

From the second factorial analysis the loadings of a general factor 
on each of the estimated primary factors appeared. The results indi- 
cate that Reasoning has the heaviest loading and Memory and Spatial 
Thinking the smallest, and that all the intercorrelations can be well 
explained by one general factor, which Thurstone did not name at 
the time. It seems probable that it will be found to correspond to 
energy, motivation, and other personal factors, but more research is 
needed, based on the use together of all types of appraisals. 

The First-Grade Battery. From the application of seventy tests 
to two hundred first grade pupils, Thurstone and Thurstone (1946) 
have shown that five factors (Verbal Meaning, Perceptual Speed, 
Quantitative Thinking, Motor Coordination, and Spatial Thinking) 
appear among five- and six-year-olds, and that these factors can be 
measured with sufficient accuracy to provide mental age scales for 
2-month intervals for the ages of three to nine years (Chapter V). 
The profiles are valuable for growth studies and for prediction of 
success in the early grades. This is an especially interesting 
contribution, because it shows that reliable measures of what probably are 
basic aptitudes can be made before formal schooling usually begins. 

Science Research Associates (SRA) Primary Mental Abilities, 
seven to eleven years and eleven to seventeen years. In 1947 the 
Thurstones issued these batteries of five tests each, in the belief that 
a profile of a student's learning abilities is more useful for indicating 
his intellectual strengths and weaknesses than a single IQ score. 
These five tests are identical with the first tests of the first five factors 
in the Chicago Tests of Primary Abilities (1943) described above. 
The second tests for these factors and the Memory Test (M) are 
omitted from the shorter SRA 1947 edition. 

The more recent battery has nearly the same reliability and fac- 
torial composition as the earlier battery, but the time allowances 
have been cut in half (26 minutes) and the answers are to be placed 
on one side of an automatically carbon-scored answer sheet. Indi- 
vidual-profile sheets include the five primaries, and also total scores 
which yield IQ's. The distribution of IQ’s is arbitrarily set to have 
a standard deviation of 16.5 points, which is very similar to that of 
the Stanford-Binet IQ's. (No thorough comparison of these two scales 
has yet come to hand.) The individual-profile sheet also gives norms 
for one-year age groups. 
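
One way a total score can be placed on an IQ scale with a fixed standard deviation is sketched below. The 16.5-point standard deviation comes from the text; the mean of 100, the linear standard-score conversion, and the use of a single age group's raw scores as norms are assumptions made for the illustration, and the battery's actual conversion tables may differ. The function name and parameters are hypothetical.

```python
from statistics import mean, pstdev


def deviation_iq(raw_score, norm_raw_scores, iq_mean=100.0, iq_sd=16.5):
    """Convert a raw total to an IQ-type score within one age group.

    norm_raw_scores: raw totals of the norm group for that age (assumed).
    The raw score is expressed as a standard score in the norm group and
    rescaled to the chosen mean and standard deviation.
    """
    z = (raw_score - mean(norm_raw_scores)) / pstdev(norm_raw_scores)
    return iq_mean + iq_sd * z
```

Because the conversion is made separately for each one-year age group, a pupil at the average of his own age group receives the mean value regardless of his raw total.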

Finally, a short interpretation of scores, only part of which is given 
here, is directed to the persons who take the test. 

People used to think that intelligence was just one ability, and that every 
person was born with a certain amount of it that remained about the same 
throughout life. Now we know that intelligence is made up of many differ- 
ent abilities, and that under certain conditions these abilities can be im- 
proved. 

Like most people, you are undoubtedly higher in some PMA’s than in 
others. You should concentrate on activities related to your high PMA’s, 
because you probably have the greatest chance for success in these. The 
higher you already are in a PMA, the more you probably can increase your 
ability to solve problems and do good work of that type through further 
training and practice. But you should not neglect the PMA’s in which you 
are low. While you may have more trouble with activities in these areas, 
you can probably improve yourself through training. Through training 
your Primary Mental Abilities, you are really learning how to think better, 
which is most important for your success in later life. 

The paragraphs below tell you what each PMA score means. For easy 
reference you may enter your percentile ranks in the boxes located at the 
right of the paragraphs. 

Verbal Meaning is your ability to understand ideas expressed in words. 
It is needed in activities where you get information by reading or listening. 
High ability in V is especially useful in such school courses as English, 
foreign languages, shorthand, history, and science. V is needed for success in 
such careers as secretary, teacher, editor, scientist, librarian, and executive. 

Space is the ability to think about objects in two or three dimensions. 
Blueprint reading, for example, requires this ability. The designer, electri- 
cian, machinist, pilot, engineer, and carpenter are typical workers who need 
ability to visualize objects in space. S is helpful in geometry, mechanical 
drawing, art, manual training, radar, physics, and geography classes. ... 

The scores in these five areas give you a general picture of your present 
ability to deal with intellectual problems. While the results of your work 
on this test are important, these PMA scores should not be considered the 
only index of your likely success in school or in later life. There are other 
areas of intelligence which were not measured here. Tests for them would 
take too long to administer. Other factors, such as your personality, vo- 
cational interests, and how hard you work also have an important bearing 
upon your chances of success. 

The SRA Primary Mental Abilities are merely a shortcut for finding out 
about your "intellectual self." They help you to understand yourself better, 
and thus to recognize your strengths and weaknesses. They can assist you 
in planning your school courses, career choices, and leisure activities wisely. 
The better you know yourself, the more successful and satisfied you can 
become.¹ 

The Guilford-Zimmerman Aptitude Survey (1947) 

Guilford and Zimmerman have issued seven tests of primary 
abilities which they believe will be much more effective in vocational 
guidance and personnel selection than the usual tests of intelligence 
or of clerical and mechanical ability. They also believe that a fairly 
complete battery would probably include twenty tests of primary 
abilities, and they hope to prepare all of these tests eventually. They 
emphasize that a comprehensive series of tests, each of which is fac- 
torially unique, has the following three advantages: 

a. The one factor which dominates each test and the degree to 
which it determines the scores is known. Only in this situation can the 
meaning of a score be clearly known. Tests where two or more 
factors are present in unknown amounts can never be clearly 
interpreted. 

b. A battery of unique tests lends itself to an enlightened 
selection of tests for a particular purpose, and produces a combination 
of weighted scores which will yield the highest possible prediction. 

c. Batteries of unique tests are the most economical because they 
eliminate unnecessary overlapping of items and include a systematic 
minimum sampling of each important factor. 

The seven tests, which they designate by Roman numerals, are as 
follows: 

I. Verbal Comprehension is a wide-range, 25-minute, vocabulary 
test of literary, nonscientific words. There are seventy-two items, each 
of which requires the person taking the test to select from five choices 
a word "which has a meaning like the word in large type." 

¹ By permission of L. L. Thurstone, T. G. Thurstone, and Science Research 
Associates. 

II. General Reasoning is a test of ability to "diagnose problems" 
as indicated by mathematical-reasoning tests. Thirty-five minutes are 
allowed for twenty-seven items, each of which involves selecting the 
correct answer for a problem from among five choices. Algebra is 
helpful in this test. 

III. Numerical Operations. Eight minutes are allowed for one 
hundred and eighty items of simple arithmetical computation. 

IV. Perceptual Speed. Five minutes are allowed for seventy-two 
items where four black silhouettes are to be matched with four of 
five other silhouettes. 

V. Spatial Orientation is designed to measure ability to see changes 
in direction and position. Each item consists of two small pictures 
of some water and land and the front end of a motorboat in which 
one is to imagine he is riding. He is required to check one of five 
choices to indicate whether in going between the first and second 
picture the boat has turned to the right or to the left, and is pointed 
higher or lower. Ten minutes are allowed for the sixty-four items. 
(See Illus. 86.) 

VI. Spatial Visualization allows 30 minutes for sixty-eight items. 
Each item requires one to choose, from five pictures, the picture 
which shows how an alarm clock would look if it were turned, tilted, 
and rotated a given number of degrees. 

VII. Mechanical Knowledge allows 30 minutes for fifty-five items. 
From five answers the one which tells how a mechanical device is 
used or defined is to be chosen. The first twenty items contain 
pictures and words; the others use words only. Items are chosen to 
show knowledge needed by such skilled workers as auto mechanic, 
plumber, carpenter, and electrician. 

Tests I and II are designed to be power tests. Their items are 
steeply graded over a wide range of difficulty, and the time limits 
are long enough to allow nearly all to attempt every item. Tests III, 
IV, and V are speed tests with items of nearly the same difficulty and 
time limits so short that few finish them. Test VI is a speed-and-power 
test in that the items become progressively more difficult, but the 
time limit of 30 minutes is long enough for the more rapid workers 
to finish. Test VII is a breadth-of-information test, in which the 
items vary somewhat in difficulty. 

Answer sheets are provided for all of the tests, except speed tests 
III and IV. Separate norms are given for test V when taken with and 
without answer sheets. The use of answer sheets for test V is not 
recommended, because they tend to introduce other components of 
perceptual speed and number. 

ILLUS. 86. THE GUILFORD-ZIMMERMAN APTITUDE SURVEY 


Part V Spatial Orientation                                   Form A 

Name ............     Date ............     Score ............ 

Nearest age  10 15 20 25 30 35 45 55 65 75      Sex  M  F 

Years of school completed  5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 

Instructions. This is a test of your ability to see changes in direction and position. In each item you 
are to note how the position of the boat has changed in the second picture from its original position in the 
first picture. 

Here is a sample item. 

These are the five possible answers to the item. 

These are tiny pictures of the boat's prow. 

This is the correct answer. It shows that the prow of the boat 
has dropped below the aiming point. 

This is the prow (front end) of a motor boat in which you are 
riding. 

This is the aiming point. It is the exact spot you would see 
on land if you sighted right over the point of the prow. 

This is the same aiming point shown above. Note that the 
prow of the motor boat has dropped below it. 

(If the prow had risen, instead of dropped, the correct answer would have been C, instead of D.) 

(By permission of J. P. Guilford and the Sheridan Supply Company.) 


These tests were published so recently that norms are available 
for only male college students. Other norms are soon to be added. 
The reliabilities are in the .90s. The validity of each test as shown 
by its correlation with its dominant factor is estimated at .60 or 
above. Tests I, III, and VII are purest in that they have estimated 
validities of .80. 

The tests are designed to measure independent factors. Tests II
and III, both of which use numbers, show intercorrelations of only
.20. The authors estimate that the true intercorrelations of factors
are probably very small. The actual intercorrelations of the tests
range from 0 to .55. Tests V and VI, both of which are concerned
with spatial thinking, correlated .55.

A tentative list of occupations is given showing the factors which
are probably important for each occupation. Thus, for airplane
pilots tests IV, V, VI, and VII are indicated, and for accountants
tests II and III. Critical scores for various occupations are to be
prepared. In designing this battery the authors have drawn from their
extensive and intensive research in the Army Air Force.

Differential Aptitude Test (DAT)

The Differential Aptitude Test is a battery of eight tests released
by Bennett, Seashore, and Wesman (1947) to provide integrated
measures of independent abilities for educational and vocational
guidance and for employment selection. All the tests except clerical
speed and accuracy are power tests in that they become progressively
more difficult, and rather liberal time allowances are provided.
Six 40-minute sessions or three 80-minute sessions are recommended.
IBM answer sheets are used throughout. The tests, which are printed
in separate booklets, are:

1. Verbal Reasoning. The first and the last word of each of fifty short
sentences are omitted. The blanks are to be filled in with a number or a
letter from four choices. (30 min.) The following is an example of a
sentence:

    ..... is to water as eat is to .....

    1. Continue    2. Drink    3. Foot    4. Girl

    A. Drive    B. Enemy    C. Food    D. Industry

The correct choices, 2 and C, are to be indicated on an answer sheet.

2. Numerical Ability is measured by forty problems which range from
simple computations to simple ratio and square-root problems. Each
problem is followed by four answer choices and the statement "None of
these." (30 min.)

3. Abstract Reasoning includes fifty nonlanguage items, each of which
shows a series of four figures which is to be extended by choosing one from
among five answer figures. (25 min.)

    PROBLEM FIGURES        ANSWER FIGURES

4. Space Relations. In this test each item consists of a 2-dimensional
pattern which could be folded into one of the five 3-dimensional objects
pictured. The forty patterns are complicated by the use of gray or shaded
surfaces. (30 min.)

5. Mechanical Reasoning is measured by sixty-eight pictures showing
various applications of mechanical principles: the balancing or lifting of
weights, propellers, gears, pulleys, and condensation. Following each picture
are 3-choice questions asking which is the heavier part, or in which
direction would a part turn, or which part turns more slowly, or which is
colder, etc. (30 min.) The pictures are unusually well produced.
Example:

    Which man has the heavier load?
    (If equal, mark C.)


6. Clerical Speed and Accuracy are appraised by comparing one hundred
combinations of letters and numbers. Only two letters, two digits, or one
digit and one letter are grouped. Each item consists of a row of five groups,
one of which is underlined. The task is to underline the same group on the
answer sheet, for example:

    Test Items        Sample of Answer Sheet

7. Language Usage I is a 100-item spelling test in which each word is to
be judged as right or wrong. (10 min.)

8. Language Usage II consists of fifty sentences, each divided into five
parts. Errors in grammar, punctuation, or spelling may occur in any or in
none of the five parts. A sentence may have errors in all five parts. (25 min.)

    Example                                              Sample of Answer Sheet

    Ain't we / going to the / office / next week / at all.
       A           B            C          D          E


The scoring for all these tests may be done by hand or by machine from
templates. The total number of right answers (R) is found for each test
and then corrected for chance success, using the number of wrong
answers (W), as follows:

    Tests 1 and 6: no correction        Test 5: R - 1/2 W
    Tests 2 and 3: R - 1/4 W            Tests 4, 7, and 8: R - W
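
The corrections listed above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name, the dictionary, and the sample counts of right and wrong answers are hypothetical and are not taken from the test manual.

```python
from fractions import Fraction

# Penalty applied to each wrong answer, keyed by DAT test number,
# following the scoring rules quoted in the text.
WRONG_PENALTY = {
    1: Fraction(0),      # Verbal Reasoning: no correction
    2: Fraction(1, 4),   # Numerical Ability: R - 1/4 W
    3: Fraction(1, 4),   # Abstract Reasoning: R - 1/4 W
    4: Fraction(1),      # Space Relations: R - W
    5: Fraction(1, 2),   # Mechanical Reasoning: R - 1/2 W
    6: Fraction(0),      # Clerical Speed and Accuracy: no correction
    7: Fraction(1),      # Language Usage I: R - W
    8: Fraction(1),      # Language Usage II: R - W
}


def corrected_score(test_number: int, rights: int, wrongs: int) -> float:
    """Return the score corrected for chance success (hypothetical helper)."""
    return float(rights - WRONG_PENALTY[test_number] * wrongs)


# Hypothetical pupil: 30 right and 8 wrong on Numerical Ability.
print(corrected_score(2, 30, 8))
```

The penalties for tests 2, 3, 5, and 7 agree with the usual rule of subtracting W/(k - 1) when each item offers k choices: five choices give 1/4, three give 1/2, and two give 1.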

Percentile norms are provided for each test for grades eight
through twelve, male and female separately, and for the two forms,
A and B. The norms are based on scores of pupils in thirty school
systems, mostly located in the Northeastern or North Central states.
In all of these grades the boys' averages surpass the girls' on Space
Relations, Numerical Ability, and Mechanical Reasoning, and the
differences are larger in the higher grades. The girls' averages exceed
the boys' on Clerical Speed and Accuracy and Language Usage. The
boys and girls average nearly the same on Verbal Reasoning and
Abstract Reasoning. Total scores on the whole battery are not given.
All of the tests may be used separately.

The average reliability coefficients for each test based on split-half 
computations from samples of from one hundred to two hundred 
pupils, are all .87 or higher with the exception of those for Mechani- 
cal Reasoning, which are .85 for boys and .71 for girls. There is a 
slight tendency for the grades of the older pupils to show higher re- 
liabilities. 
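
A split-half computation of the kind mentioned above may be sketched as follows. The text does not say how the halves were formed or stepped up; the sketch assumes an odd-even split and the Spearman-Brown formula, which is the common practice, and the small table of item scores is invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one row per pupil, one 0/1 entry per item (assumed layout)."""
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_half, even_half))


# Invented scores for five pupils on a six-item test.
pupils = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
print(round(split_half_reliability(pupils), 2))
```

With samples of only one hundred to two hundred pupils, coefficients obtained in this way are themselves subject to sampling error.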

In order to discover the independence of the scores, correlations
for form, grade, and sex were computed between each test and all
the others. The results showed correlations of from .50 to .60
between the Verbal Reasoning and all the other tests except Clerical
Speed and Accuracy. Numerical Ability correlated .50 with Verbal
Reasoning, .54 with Abstract Reasoning, and .50 with Language
Usage Sentences. Abstract Reasoning correlated .56 with Space
Relations, .52 with Verbal Reasoning, and .51 with Mechanical
Reasoning. The two Language Usage tests correlated .62. Clerical Speed and
Accuracy showed correlations below .37 with all the other tests. With
large adult groups these correlations would undoubtedly be smaller,
but there was an unexpected tendency for the intercorrelations to
increase slightly in the twelfth grade.

General Aptitude Test Battery (GATB; 1947)

The General Aptitude Test Battery was developed by the United
States Employment Service, Washington, D. C., and was made
available in 1947 to the various state employment service offices.
According to Dvorak (1947) it is intended for the use of employment
counselors in appraising the aptitudes of individuals. Eleven of the tests
use paper and pencil, and written and oral directions. Their range
of difficulty makes them applicable to all adults who can read and
understand directions in English and handle paper-and-pencil
situations. While no age or grade equivalents have been issued, it seems
probable that the tests would not be applicable to individuals with
less than fifth-grade accomplishments. Four of the tests require
manipulation of pegs or small washers and rivets, so that for them
language is a small factor in understanding the test directions. No
attempt was made to have these tests look like work samples, but they
are designed to indicate the aptitudes likely to be required in
successful performance on a large variety of jobs. All the tests are timed
and speed is an important factor, since the tests are so made that very
few in a group are able to finish in the time allowed.

In order to prepare this test battery, factor-analysis studies were
conducted on several experimental batteries, including in all
fifty-nine tests which had been administered to 2,156 adults. These were
divided into nine experimental groups. The largest group was
composed of 1,079 subjects, ages from seventeen to thirty-nine years, mean
age twenty-three, and all had completed at least the sixth grade. The
average subject had completed the eleventh grade, and 99 per cent
had completed from 8 to 16 grades. Factorial analyses, using
Thurstone's centroid method, were applied to the results. These showed
eleven fairly independent factors which are thought to be
occupationally significant, namely, Verbal, Numerical, Spatial Thinking
(2 types), Perception (2 types), Dexterity (4 types), and Intelligence.
The fifteen tests with the heaviest factor loadings and the maximum
internal consistency were then selected. The aptitudes are measured
thus:

V Verbal is measured by a 5-minute test with sixty multiple-choice items
in which one must identify relationships of same or opposite among four
words. Scientific or technical terms are excluded.

N Numerical is measured by a 6-minute test of twenty-five arithmetic
problems and a 5-minute test of fifty computational problems which do not
include fractions or decimals.

S Spatial Thinking is measured by two tests. One of these is a 7-minute
test of forty-nine multiple-choice 2-dimensional problems of rearrangement
of elements. It is similar to the Minnesota Paper Form Board. The other
test allows 6 minutes for forty problems of 3-dimensional surface
development.

P Form Perception is measured by a 4-minute test of forty items in which
pictures of objects are to be exactly matched with one of four choices. The
second test allows 5 minutes for 60 items of matching paper figures. It is
similar to the Minnesota Spatial Relations Test (Illus. 100).

Q Clerical Perception is measured by a 6-minute, 150-item,
name-comparison test; the subject must indicate whether the names are the
same or different, as in the Minnesota Clerical Test.

A Aiming is measured by a 30-second test of one hundred items which
requires a pencil line to be placed on the crossbar of an H, % inch high
and % inch across. Aiming is also measured by a 60-second test in which three
lines are to be made in each of two hundred ^4-inch squares.

T Motor Speed is measured by the last test described under Aiming and
also by a 30-second test of placing three dots in each of seventy printed
boxes. The boxes measure by inch and are printed in rows of seven.

F Finger Dexterity is measured by two tests: assembling and disassembling
fifty rivets and washers, using a standard board for holding them. The score
is the number assembled in 90 seconds and disassembled in 60 seconds.

M Manual Dexterity is measured by two tests. One requires the
transference of forty-eight round 2'*,l»-inth pegs from one part of a board
to another. The other requires moving the same pegs but also turning them
end to end. The score is the number of pegs moved in three periods of 15
seconds each and the number of pegs moved and turned in three periods of 50
seconds each.

G Intelligence. The authors found a fairly heavy loading of one factor
in all verbal and number tests, and in most of the spatial tests. This
factor appears to have some of the properties of Spearman's G, but it has
a wider significance than Thurstone's reasoning or induction factors. It has
therefore been given the symbol G. It is indicated by a combination of three
of the tests included above which showed significant loadings, namely, the
verbal, the numerical, and the spatial thinking in three dimensions. The
use of an Index of Intelligence in an analytical profile is controversial, for
the three tests also appear elsewhere in the profile, and the G score may
be fairly high even when a person does poorly on one of the three tests.

During the administration of the tests, the applicant's adjustment
to the situation is to be recorded to show nervousness, disabilities,
copying answers or attempting to copy answers from a neighbor, lack
of reading ability, writing letters instead of making check marks,
and other similar acts which might affect the scores. If the examiner
feels that the tests are a good sample of the worker's ability, they are
scored. Hand stencils are used. The raw scores of all tests are
converted to a point scale which has the mean at 100 and the standard
deviation at 20 for the large general population group.
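
A conversion of this kind can be sketched as a simple linear rescaling. This is an assumption made for illustration: the published conversion tables are not reproduced in the text and may not be strictly linear, and the raw scores below are invented.

```python
from statistics import mean, pstdev


def to_point_scale(raw_scores, target_mean=100.0, target_sd=20.0):
    """Linearly rescale raw scores so the norm group has the target mean and SD.

    Assumes raw_scores is the whole norm group; the population standard
    deviation (pstdev) is therefore used.
    """
    m = mean(raw_scores)
    s = pstdev(raw_scores)
    return [target_mean + target_sd * (x - m) / s for x in raw_scores]


# Invented raw scores for a small norm group.
raw = [10, 20, 30, 40, 50]
converted = to_point_scale(raw)
print([round(score, 1) for score in converted])
```

On such a scale a converted score of 130 lies one and one-half standard deviations above the mean of the norm group.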

Each individual's scores are placed on a profile card which allows
a quick comparison with the Occupational Aptitude Patterns (OAP).
These patterns are cut-off scores of from two to four aptitudes, which
have been determined by applications to groups of workers in an
occupational field. Each cut-off score is the score which was made by
approximately the 33rd centile of an occupational group. In other
words critical scores are given, which divide the lowest third from
the upper two thirds of an occupational group. For instance,
Occupational Aptitude Pattern No. 2, Accounting and Related, has
only two cut-off scores, G-130 and N-130, which means that to have
a reasonable chance of success in this field, one must score at least
1 1/2 standard deviations above the mean, that is, among the highest
7 per cent of the general population in both Intelligence and
Number. The other tests are disregarded for this field of work.

Occupational Aptitude Pattern 4, All-Round Metal Machining
and All-Round Mechanical Repairing, requires scores of at least 100
for four aptitudes (G, N, S, and P), which means that two thirds of
the machinists and mechanics who were tested made scores on these
aptitudes above the averages of the general population. A tentative
list of twenty occupational aptitude patterns is given.
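
The comparison of a profile with the cut-off scores can be sketched as follows. Only the two patterns quoted above are included, the function names and the sample profile are hypothetical, and the percentage calculation assumes that scores in the general population are normally distributed, which the text does not state.

```python
from statistics import NormalDist

# Point scale for the general population: mean 100, standard deviation 20.
GENERAL_POPULATION = NormalDist(mu=100, sigma=20)

# Cut-off scores quoted in the text; the other patterns are not reproduced.
OAP_CUTOFFS = {
    "OAP 2, Accounting and Related": {"G": 130, "N": 130},
    "OAP 4, All-Round Metal Machining and Mechanical Repairing": {
        "G": 100, "N": 100, "S": 100, "P": 100,
    },
}


def patterns_met(profile):
    """Return the names of patterns whose every cut-off score is reached."""
    return [
        name
        for name, cutoffs in OAP_CUTOFFS.items()
        if all(profile.get(aptitude, 0) >= cut for aptitude, cut in cutoffs.items())
    ]


def percent_at_or_above(score):
    """Per cent of the general population expected at or above a score."""
    return 100 * (1 - GENERAL_POPULATION.cdf(score))


# Hypothetical applicant profile.
applicant = {"G": 105, "N": 100, "S": 120, "P": 101}
print(patterns_met(applicant))
print(round(percent_at_or_above(130), 1))
```

Under the normal assumption a score of 130 is reached by a little under 7 per cent of the general population, which agrees with the figure given for Pattern No. 2.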

During the counseling interview, there is an exploration of those
OAP's whose critical scores are met. Usually only the two or three
OAP's which indicate one's highest skills are considered. Thus, if
a client qualified on OAP's 2, 4, 11, and 16, then 11 and 16 would
be disregarded because these refer to the lower skills of Routine
Reporting Work and Simple Visual Inspection. It often happens that
persons who have literary, accounting, mechanical, and clerical
aptitudes also meet the requirements for many kinds of routine
inspection and assembly work.

This method of matching individual scores to occupational
requirements avoids an over-all score, and aids in rapid interpretation.
Although the GATB was standardized on adult workers, it will
undoubtedly be applied also, after further standardization, to groups
in high schools and colleges.

The Yale Aptitude Battery 

Crawford and Burnham (1946) issued an intensive discussion of
forecasting college achievement, and described a battery of seven
tests, each printed in a separate booklet with a liberal time allowance
of approximately 50 minutes. All except the first test were developed
by the authors at Yale. Tests I, II, and III of this battery were
significant in predicting success in liberal arts studies, tests III, IV, and V
in pure science and mathematics, and tests V, VI, and VII in the
applied sciences, for example, engineering. The battery is composed of:

1. The verbal section of the College Entrance Board Scholastic Aptitude
Test is to a large extent a measure of literary vocabulary.

2. The Artificial Language test requires one to learn rapidly eight new
words and a prefix to indicate the future tense, and then to translate short
statements into English.

3. The Verbal Reasoning test is one in which the person being tested
reads a paragraph and draws conclusions or interpretations based on the
paragraph. For each conclusion one marks one of five levels of probability
that it is true.

4. The Quantitative Reasoning test uses algebra and number-series
completion.

5. The Mathematical Ingenuity test involves skill in solving algebraic
equations and making geometric statements algebraically.

6. The Spatial Relations test requires one to look at pictures of piles of
cubes, and to determine how many cubes have one, two, three, four, or five
sides painted, assuming that all sides are painted which do not touch another
cube or the surface upon which they are resting.

7. The Mechanical Ingenuity test directs one to look at diagrams
of gears, pulleys, and forms to determine relative movements and stability.
of gears pullers, and forms to detcirnnic* lelaiive mo\ements and stabiht)-. 

Crawford and Burnham did not offer any figures to show the
factorial purity or homogeneity of each test, but they published an
intercorrelation matrix based on 850 Yale freshmen. The median of
these correlations is .41, and three of them are above .60. Hence the
uniqueness or independence of some of these tests is not as great as
is desirable. Tests I and III, Vocabulary and Verbal Reasoning,
correlated .64; tests IV and V, both of which use algebra extensively,
correlated .62; and tests IV and VII, Quantitative Reasoning and
Mechanical Ingenuity, correlated .61 with each other. In spite of these
correlations the authors claim fairly good specificity of prediction.

California Tests of Mental Maturity 

The California Tests of Mental Maturity were published by
Sullivan, Clark, and Tiegs (1937) in four batteries, one for kindergarten
and first grade, and the others for first to third grades, fourth to eighth,
and ninth to fourteenth. In each battery sixteen subtests are
distributed among five sections. The first section is designed to detect gross
visual, hearing, and motor handicaps. The second section contains
one immediate recall and one delayed recall test; the third, three tests
involving spatial relationships; the fourth, seven tests of verbal and
numerical reasoning; and the fifth, a 50-item multiple-choice
vocabulary test. The test norms allow one to draw a profile (Illus. 87)
showing sixteen separate scores as well as total scores for verbal and
nonverbal factors and for the whole test. The reliabilities for the
subtests range from .70 to .95 and for the totals from .90 to .96. The total
test requires about 90 minutes. It is one of the most extensive of its
kind, and the profile of scores presents a picture of skills which have
been found to be somewhat independent.

PRACTICAL RESULTS 

Only a few reports from among many hundreds can be cited here.
Most of them show progress and promising possibilities for both
academic and vocational predictions.

Age and Sex Differences

Age differences are noted by all authors of scales for school
populations. The average differences between adjacent age groups shown
for the SRA Primary Mental Abilities are about equal in raw-score
points for each year from eleven to seventeen. Smaller-than-average
differences are shown below the 25th centile, and also above the 90th
centile. In the latter case the differences are probably due to a low
ceiling for the tests. The same age patterns are seen in the Differential
Aptitude Tests (DAT).

ILLUS. 87 CALIFORNIA MENTAL MATURITY TEST PROFILE

CALIFORNIA TEST OF MENTAL MATURITY: ELEMENTARY BATTERY
Devised by Elizabeth T. Sullivan, Willis W. Clark, and Ernest W. Tiegs

(Copyright, 1937, by E. T. Sullivan, W. W. Clark, and E. W. Tiegs. By
permission of the Southern California School Book Depository, Los Angeles,
California, and the authors.)

The sex differences are probably significant at all ages, but
become somewhat larger among adults. The use of different norms
for boys and girls is desirable when boys are competing with boys
only, and girls with girls only. If both boys and girls are taking the
same course or applying for the same job, however, they should be
compared on the same basis. Most of the authors have not furnished
norms for each sex separately, but the authors of DAT have done so.
There are no significant differences in Verbal or Abstract
Reasoning. The boys score higher in Number, Space Relations, and
Mechanical Reasoning, and the girls score higher in Clerical Speed and
Accuracy and Language Usage. Thus, a twelfth grade boy who is
at the 80th centile in spelling among boys would be only at the 58th
centile among girls, and a girl at the 05th centile in Mechanical
Reasoning among girls would rank at the 18th centile among boys.

The reliability and validity of a test may vary a good deal when
applied to different sexes, hence these differences must be carefully
explored and reported.

Prediction of Academic Achievement 

The criteria of academic achievement are usually the grades
received in a course of study, or the average grade in a group of courses.
Grades are the result of many complex interactions of ability,
methods of instruction, motivation, outside distractions, various standards
of grading, and other factors. Furthermore, grades are usually
expressed on a 5-point scale, A, B, C, D, and E, with little or no
attempt to have equal steps in this scale. Many times so few E's or A's
are given that the scale is reduced, in effect, to 3 or 4 points. Such
roughness in grading complex processes doubtless reduces the
accuracy and hence the reliability of the grades. Analytical studies of
academic grades, which show the important factors in success for
particular groups in a particular course, have been outlined, and
much progress has been made in defining goals of achievement in
school. Two excellent summaries of this progress are found in a
report by Smith and Tyler (1942) for the Committee on Evaluation of
the Progressive Education Association, and in the Forty-Fifth
Yearbook of the National Society for the Study of Education (1946).
However, no reports have come to hand which clearly show the
relationships between academic achievement and the more complex social
and intellectual goals.

A good many writers have pointed out that unless group tests are
used carefully they may lead to harmful judgments regarding both
children and adults. D. A. Wooster (1947) reports a twelve-year-old
boy in the fifth grade who had obtained a Henmon-Nelson Test of
Mental Abilities IQ of 53. It had been assumed that his low
mental-ability score was a result of poor intelligence and that this in turn
had retarded his progress in learning to read. An Iowa Silent
Reading Test yielded a score of 3.5 grades, but there were also successes
on this test far above the sixth-grade level. An individual
Stanford-Binet Test Form L gave this boy an IQ of 78 and showed that his
language ability was about that of a ten-year-old boy. Later a
Paterson short form showed a Performance Quotient (PQ) of 98. While
these tests are not supposed to be entirely equivalent, the differences
are far greater than the normal variations between tests. When the
boy was asked to read portions of the Iowa Silent Reading Test it
became apparent that he could scarcely read. He stumbled over the
simplest words and was exceedingly slow in reading those which he
did know. It appeared that he had followed instructions and put
marks in certain spaces and by pure chance had attained scores which
gave him a much higher rating than his true abilities warranted. The
boy's social background and job experience revealed some good
reasons for his poor language ability. Wooster concludes:

It is apparent that many school people have not been trained to the
point of realizing that the choice of the mental-ability test appropriate for
a given individual must be based upon the knowledge of the circumstances
surrounding his case. The absurdity of giving a mental test involving
reading to one who is deficient in reading and then concluding that his
mentality is low is probably fairly common. Great harm is likely to result from
such practice, but it seems that precaution must be given again and again. It
should be stated categorically that no group test of any kind should be used
unless there is provision for intensive individual study of those persons
making those scores.

Here again the advantages of an analytical profile test become ap- 
parent. 

Prediction of One-Semester Grades. One of the most careful
studies of prediction of academic success from aptitude tests is that
of Crawford and Burnham (1946), who used a battery of seven tests
administered at the beginning of the freshman year. For example,
predictions of college grades at the end of the first term for the class
of 1944 are shown in Illus. 88. Each of the aptitude tests is shown to
predict academic success well, that is from .42 to .57, in only the
corresponding type of course. These results are highly desirable, for they
will make possible more specific predictions of success than could
be made from a general measure. The authors conclude that T scores
of 60 or higher indicate positive aptitudes which should be
encouraged, while those under 40 are "red stop-signals" for particular fields,
unless there are unusual circumstances.

Prediction of Four-Semester Grades. Goodman (1944)
summarized seven reports in which Thurstone's Primary Abilities Test
results had been compared with college grades for various groups. One
of the most interesting was a comparison of two studies of a group of
113 women in the Home Economics Department of Pennsylvania
State College, one at the end of the first semester by Virginia D.
Tredick and the other at the end of the fourth semester by Elizabeth
W. White. Miss White averaged the grades for all courses taken in
a subject over a 2-year period and used this average as the academic
criterion of success in the subject. By the end of the second year only
94 women were available for the study. Presumably this resulted in
some loss in range of ability of the group.

Illustration 89 shows that the correlations between
Reasoning-Ability scores and the criteria of academic success were all slightly
higher after four semesters than after one. Verbal-Meaning scores
correlated with English grades .55 after one semester, and .65 after
four semesters. In general the changes are small and probably
insignificant. For Art and English the average predictions increased
as time went on; for Home Economics and Point Averages, the
average predictions decreased somewhat; for Science grades, the
predictions were nearly the same.

ILLUS. 89 CORRELATIONS OF PRIMARY ABILITIES WITH ACADEMIC
SUCCESS AMONG HOME ECONOMICS MAJORS

Primary                                             Home        Point
Ability          Art       Science     English    Economics    Average
              1*    4*     1     4     1     4     1     4     1     4
Perception   .15   .13    .20   .18   .19   .20   .31   .11   .28   .19
Number       .11   .13    .46   .44   .22   .28   .20   .17   .41   .33
Verbal       .24   .29    .28   .33   .55   .66   .50   .32   .51   .49
Spatial      .25   .28    .23   .20   .10   .14   .22   .10   .28   .19
Memory      -.02   .11    .25   .28   .08   .26   .12   .02   .20   .20
Induction    .26   .26    .37   .23   .19   .18   .35   .14   .40   .24
Reasoning    .21   .30    .43   .49   .21   .30   .24   .36   .42   .45

* After one or four semesters

(By permission of Goodman (1944) and the editors of Educational and
Psychological Measurement.)

More research is needed to determine the reasons for these changes,
which may be due to changes in grading, course content, or the
students included in the study. One important finding is that as a whole
the predictions of success in specific subjects were as high, or higher,
after four semesters as after one semester, while the point averages,
which mix all subjects together in unknown combinations, tended to
go down. The point averages probably represent more heterogeneous
scores after four semesters than after one.

Prediction of Success in Field of Specialization. To what extent
can measures of primary abilities predict success in professional
studies? No direct follow-up studies have come to hand, but Adkins
(1940) reported average ability profiles of graduate students from ten
universities in twelve professional fields. She noted fairly distinctive
average profiles for (1) chemistry and mathematics, where the
highest scores were in Number (N), Verbal Meaning (V), Spatial
Thinking (S), Induction (I), and Deduction (D); (2) physics and
engineering: N, S, and D; (3) accounting, business administration, and
pharmacy: N; and (4) medicine: very superior throughout, but slightly
higher in D, V, and S. She emphasized the overlapping of
distributions of scores from the various fields of specialization and the need
for measures which would discriminate more effectively between
these fields.

Another report by Stuit and Hudson (1942) gave profiles for groups
of students in engineering, journalism, and medicine similar to
those found by Adkins. These authors also compared the
Primary-Ability scores with grade averages and found correlations of from
-.219 to .577. The correlations in this case were lowered by the fact
that only high-ranking students were included in these groups. Thus,
among engineers, the Spatial ability correlated only .178 while the
Verbal ability correlated .577 with grade averages. Most of the
engineers made such high scores on the Spatial tests that the test probably
failed to distinguish their relative abilities in this factor. This points
to the need for more difficult tests of primary abilities when only
the highest 8 or 10 per cent of the population is to be measured.
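
The lowering of a correlation in a selected group can be illustrated with the standard formula for restriction of range. The formula is not given in the text; it assumes direct selection on the test, a linear relation, and equal scatter about the regression line, and the numbers used below are hypothetical.

```python
from math import sqrt


def restricted_correlation(r_full, sd_ratio):
    """Correlation expected in a selected group.

    r_full   : correlation in the unselected group
    sd_ratio : test-score SD in the selected group divided by the SD
               in the unselected group (1.0 means no selection)
    """
    return r_full * sd_ratio / sqrt(1 - r_full ** 2 + (r_full * sd_ratio) ** 2)


# Hypothetical case: a test correlating .50 with grades in the general
# run of students, applied to a group whose spread of scores is only
# four tenths as large.
print(round(restricted_correlation(0.50, 0.40), 3))
```

A test which most members of a group pass with nearly perfect scores has a very small spread in that group, so a low coefficient such as that reported for the Spatial tests among engineers need not mean that the ability is unimportant.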

Other important studies of prediction of the results of military
training are described in Chapter XI. Careful studies of prediction
of grades in elementary or high school from academic achievement
tests are discussed in Chapter VII.

Prediction of Vocational Success 

Satisfactory criteria of vocational success are exceedingly hard to
find and usually rather vague. Until more analytical approaches
are made to both job success and workers' characteristics, predictions
of success will be, as they are at present, rather sketchy. In a few
situations correlations as high as .65 have been reported between
production records or ratings and a weighted combination of two
or three tests on small samples. However, most studies report smaller
correlations of from .10 to .35, which are significant only for a rough
screening of applicants. No thorough report of the application of a
battery of primary-ability tests to an occupational group has yet
come to hand, but Dvorak (1935) found wide variations in average
scores of groups of janitors, policemen, garage mechanics,
ornamental-iron workers, nurses, saleswomen, and women office clerks,
when she compared four groups of tests as shown in Illus. 132. These
tests seem to sample four primary abilities fairly well. The Pressey
Senior Classification Test is heavily loaded with verbal meaning.
The Minnesota Clerical Tests measure speed of perception with
words and numbers. The dexterity tests involve speed of
hand-and-eye coordination, and the mechanical-ability tests involve familiarity
with small gadgets or tools, and spatial comparisons of size, shape,
and position. This illustration shows that there are large differences
between retail saleswomen and the groups of nurses and clerks, and
that the nurses average a little higher than the clerks on the Pressey
Senior Classification Test. On speed-of-perception and dexterity tests,
however, the clerks are considerably ahead of the nurses, while the
groups are nearly the same on the mechanical-ability tests. The
saleswomen are a little above the nurses in finger dexterity but the
same in tweezer dexterity. Dvorak also gave figures to show the
overlapping of scores of occupational groups. Thus, on the Pressey Senior
Classification Test it was found that 92.9 per cent of the clerks
reached or exceeded the median score of the saleswomen. This is
shown graphically in Illus. 90, where it is apparent that the upper
half of the saleswomen had scores similar to those of the lower half
of the clerks. One test of this type, therefore, did not distinguish


ILLUS. 90 SCORE ON PRESSEY CLASSIFICATION TEST 



(By permission of the University of Minnesota Press.) 
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well between the two groups, but could be used for a rough screen- 
ing The whole profile, on the other hand, was found by Dvorak 
to be significant, for she took 158 individual profiles from the files 
at random, 90 clerks and 68 saleswomen, and mixed them together. 
Then, solely on the basis of the profiles and the norms shown in 
Illus. 132 and Ulus. 90, the individuals were divided into two groups 
by an assistant This resulted m the correct classification of 92 4 per 
cent of the workers; 5.1 per cent were doubtful, and 2.5 per cent 
were incorrectly classified While it would doubtless be true that in 
a large random sample of employed women there would be more 
doubtful or incorrectly classified cases, still the typical patterns are 
significant 
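
The two kinds of comparison described above, the overlap of one group upon the median of another and the sorting of individuals by whole profiles, can be sketched as follows. The scores and average profiles are invented for illustration, and the nearest-profile rule is only a mechanical stand-in for the judgment exercised by Dvorak's assistant, not her procedure.

```python
from statistics import median


def percent_reaching_median(group_scores, reference_scores):
    """Per cent of group_scores that reach or exceed the median of reference_scores."""
    ref_median = median(reference_scores)
    hits = sum(1 for s in group_scores if s >= ref_median)
    return 100.0 * hits / len(group_scores)


def classify_profile(profile, group_norms):
    """Assign a profile to the group whose average profile lies nearest.

    Nearness is plain Euclidean distance, an assumption made for this sketch.
    """
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    return min(group_norms, key=lambda name: distance(profile, group_norms[name]))


# Hypothetical single-test scores for two occupational groups.
clerks = [55, 60, 70, 75, 81, 88, 94]
saleswomen = [40, 48, 55, 61, 66, 72, 79]
overlap = percent_reaching_median(clerks, saleswomen)

# Hypothetical average profiles: (verbal, clerical speed, dexterity, mechanical).
norms = {"clerk": (70, 80, 65, 45), "saleswoman": (50, 55, 60, 45)}
label = classify_profile((68, 74, 66, 44), norms)

print(f"{overlap:.1f} per cent of clerks reach the saleswomen's median")
print("profile assigned to:", label)
```

A single test yields only the overlap figure, whereas the profile rule uses all four scores at once, which is the reason the whole profile may separate groups that one test does not.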

This point brings up the question, how many of the tests in a
battery are useful for the selection or promotion of particular groups?
If the occupation in question makes little or no use of a particular
ability, or if the less able workers do as well in it as the more able, it
has been argued that the test for that ability should be omitted.
Most of the present reports from industry show that only a few tests,
which appeared from a job analysis to be most appropriate, have
been used in any study. However, there is considerable anecdotal
evidence that poor work-adjustments are often the result of lack of
opportunity to use the skills or aptitudes that a person has. Thus,
a woman with a marked artistic skill may have the required abilities
for clerical work, but might not do well at such work. In order to
avoid training and placing persons on jobs where they will not be
satisfied, a fairly complete profile of abilities and knowledge would
be most effective. With the analytical tests which are now available,
such applications will be much more frequent.

In order to be of greatest value the tests must be applied to groups
in training or without much experience, and then evaluated several
months or years later from criteria of success on the job. This
procedure takes time.

Prediction of Intelligence 

An important study of the prediction of intelligence at college
entrance from earlier tests was reported by R. L. Thorndike (1947).
The verbal scores on the College Entrance Board Scholastic Aptitude
Test given in the twelfth grade were taken as the terminal criteria
of intelligence, because they have been issued annually in a
well-standardized form since 1924. About ten thousand records of pupils
from public and private secondary schools near New York City
were located, and five thousand of these were selected for analysis
because they seemed complete enough to be significant. Thorndike
found that the prediction of the terminal-test score was about the
same for all tests given at any time during the senior high school
period. Thus, for the Otis S-A Higher Test the correlations were
.64 when it was given in the same year as the terminal test, .65 when
given the previous year, .62 after a 2-year interval, and .65 after a
3-year interval. Similar figures for the Terman Group Test were
.82, .81, .77, and .69, and for the verbal score on the American
Council Psychological Examination, .70, .70, .73, and .69. The results
indicate that any of these tests when given in the ninth grade predicted
verbal comprehension 3 years later as well as did the same test when
given in the tenth, eleventh, or twelfth grade. This fact is very
significant for counseling, because such data are more valuable in
the earlier than in the later stages of the student’s development.
Tests given in the seventh and eighth grades showed somewhat poorer
predictions but still significant general trends (median r about .60).
Grades four, five, and six yielded predictions from Stanford-Binet
Tests of approximately .59 on small samples, and the lower grades,
approximately .40.

Predictions were doubtless reduced in part by the differences in 
functions measured at different ages and by the different tests. The
Verbal Score of the C.E.B. Scholastic Aptitude (terminal test) is a 
fairly pure test of verbal meaning and relationship. Other tests stress 
this factor but also include unknown amounts of mathematical, 
spatial, and other types of content. The only way to avoid this dif- 
ficulty is to use a battery of factorially pure tests. Predictions were 
also probably reduced by the limited sample of pupils available. 
All these figures probably come from persons in the highest quarter 
of the total population, and it is likely that more than half of them 
fall in the highest 5 per cent. A further selection was probably made 
in the case of the Stanford-Binet Tests, because these tests are usually 
given only to pupils who need special attention. 

COMPARISON OF SCALES AND NEEDED RESEARCH 
Purposes and Coverage 

There are two different purposes for preparing analytical
batteries. One, typical of Thurstone, Guilford, and the GATB and the
AAF Tests (Chapter XI), is to determine and measure primary
abilities as basic research tools; the other, more typical of the United
States Navy Tests, the Yale Battery, and the Differential Aptitude
Tests, is to predict academic or vocational success more accurately
than can be done with a single index. The first purpose puts greater
emphasis on mathematical analyses of correlations and the purity of
test items. The second stresses the inclusion of tests which yield the
best predictions of success. The purity or independence of the tests is
a secondary consideration. Both tend to avoid items indicative of
purely academic achievement, interest, or personal adjustment. Both
approaches have resulted in batteries which are intended to
measure practically the same skills.

In order to compare the scales described above, the writer has
prepared Illus. 91. Eight main groups of factors are used, which
include five areas of skill: language, number, spatial thinking,
mechanical principles, and dexterity. The three other main groups,
perception, learning, and reasoning, are ways of thinking that may be
applied to any of the five areas.

Since the eight chief groups have been subdivided and given some- 
what technical definitions, they are described below. A remarkably 
thorough and clearly documented discussion of these is given by 
Guilford and Lacey (1947), from which this summary is largely drawn. 

Language. This group measures the various aspects of the use of
words as symbols. There appear three well-defined subgroups: (a)
word meaning; (b) word fluency, which involves the rapid recall or
use of previously learned words; and (c) written symbols or usage,
which includes spelling, grammar, and punctuation. Further
subdivisions of word meaning have been demonstrated by Greene (1939),
who prepared eight fairly independent vocabulary scales, tentatively
named social, commercial, government, physical science, biological
science, mathematics, graphic arts, and sports.

Number. This group of tests involves the use of numbers for
computation, and is measured in its purest form by tests of addition,
subtraction, multiplication, and division. Many authors also include
arithmetical problems here, but these are always found to have high
loadings of reasoning and some verbal factors. Geometrical
problems also involve some spatial factors. Algebraic problems were
found to have high components of general reasoning.

Spatial Thinking. This group has a core of visualizing or
estimating what would happen if 2- or 3-dimensional figures were moved,
rotated, or unfolded in some way. Guilford’s Orientation, Test V,
requires one to identify the direction of movement of oneself if one
were in a pictured series. Guilford and Lacey list three spatial
factors: (1) an “order of relationship between objects,” (2) a right-left
discrimination, and (3) a spatial factor, which as yet is difficult to
name. In addition, they describe visualization as a
visual-manipulative ability.

Mechanical Principles. This group of tests requires one to apply
knowledge of mechanical principles or elementary physics to pictured
situations which involve weight, force, heat, and light. Two fairly
different patterns are usually mixed. One pattern involves
knowledge of materials and tools; and the other calls for a reasoning
process when it is necessary to draw inferences and apply rules.

Dexterity. All of these tests involve hand-and-eye coordination,
and four fairly independent factors appear: aiming, fine
manipulation, coarser manipulation, and a simple, fast movement, such
as tapping, which requires little accuracy.

Perception. Perception loading is found in all tests, but in this
group of tests of perception other factors are minimized by requiring
only simple judgments of like or different. Perception may, of course,
depend on the keenness of a sense organ, but in these batteries only
large visual patterns are used, so that unless a person has very
defective vision, the test is a fairly pure measure of speed of mental
comparisons. Familiarity with words or forms makes a difference,
statistically and probably vocationally, but the correlations are usually
fairly high between various visual-perception tests. None of these
batteries includes perception of sound, but the Radio Code of the
United States Navy and the Seashore Musical Tests include
evaluations of this important factor.

Guilford and Lacey also describe two mental-set factors: one is 
the ability to keep up with rapidly changing instructions and the 
other the ability to grasp a wide variety of tasks requiring speed and 
accuracy. 

Learning. All the tests in this group require one to make new
associations during the test. Rote learning or immediate memory is
well isolated in Thurstone’s tests of associating first and last names
or words and numbers. More logical learning is illustrated by the
Artificial Language Test used by Crawford. Most of the scales do not
include tests of learning, although learning may have considerable
significance for certain vocations. Guilford and Lacey list four
memory factors corresponding to immediate recall of pairs, recognition
of pictorial material, recall of a picture-symbol relationship, and
memory of verbal instructions.

Reasoning or Problem Solving. Under this group are considered
tests of reasoning, judgment, foresight, and planning: all tests which
require one to formulate general rules from observed phenomena, or
to apply a rule in solving a problem. All of these tests usually show
significant factors in perceptual speed. Other factors, such as
number, verbal, spatial relations, and mechanical experience, are found
in tests where particular symbols are used.

In order to secure comparable measures of reasoning, the
knowledge or information background must be kept constant. There are
two possible procedures, neither one of which has yet proved
practical. One procedure would be to try to prepare tests using symbols
that are equally familiar to all candidates. No one has yet found
such symbols, although some simple language or nonlanguage
situations may be useful. The other procedure would be to get two
measures, one of knowledge, the other of knowledge and reasoning,
and then to remove statistically the knowledge variance from the
reasoning scores. Much research is needed here. In many practical
situations freedom from emotional stress is an important part of
problem solution. This, too, should be measured separately by
securing scores under stress and nonstress situations.
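
One way in which the knowledge variance might be removed statistically is to predict the combined score from the knowledge score by a straight-line regression and to keep only the remainder. The sketch below assumes this simple linear method, which the text does not prescribe, and uses invented scores.

```python
def residualize(combined, knowledge):
    """Return the part of each combined score not predicted from knowledge.

    A simple linear regression of combined on knowledge is assumed here;
    the residuals are uncorrelated with the knowledge scores by construction.
    """
    n = len(combined)
    mean_k = sum(knowledge) / n
    mean_c = sum(combined) / n
    sxy = sum((k - mean_k) * (c - mean_c) for k, c in zip(knowledge, combined))
    sxx = sum((k - mean_k) ** 2 for k in knowledge)
    slope = sxy / sxx
    return [(c - mean_c) - slope * (k - mean_k) for k, c in zip(knowledge, combined)]


# Hypothetical scores for six candidates.
knowledge_scores = [10, 14, 18, 22, 26, 30]
combined_scores = [25, 33, 36, 47, 49, 60]  # knowledge plus reasoning
reasoning_residuals = residualize(combined_scores, knowledge_scores)
print([round(r, 2) for r in reasoning_residuals])
```

A positive residual marks a candidate who scored higher on the combined test than his knowledge alone would lead one to expect. Whether such residuals are reliable enough for practical use is the research question left open above.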

In good problem solving the following four fairly distinct activities
have been observed: (a) grasping a whole situation so as to define the
problem, not being disturbed by minor or unimportant details; (b)
being aware of reasonable solutions or hypotheses; (c) trying out
solutions quickly, mentally or with objects; and (d) selecting and
applying one of the best solutions.

These four activities may occur fairly separately, but they usually
seem to interact upon one another in life situations. All available
factorial analyses of complex-reasoning tests have thus far yielded
rather vaguely defined factors which seem to be related to
combinations of mental activities and emotional adjustments. Various
individuals undoubtedly use different combinations in arriving at the
same score in a reasoning test. When a single score has several
different meanings, vague factorial results will always follow.

No one who has worked in this field claims that these are all or
even the principal abilities underlying human behavior. Nearly
every factorial analysis yields statistical factors which are hard to
identify, but which probably reflect variance in such aspects as
energy output, speed, acquaintance with test situations, and
distractions. Furthermore, there are large fields of skills involving sound,
color, and physiological and structural variations which have not
yet been explored by this technique.

Item Analysis: Form, Content, Number 

An inspection of these scales will show that nearly all the items
are in multiple-choice form, except those in the dexterity tests where
the number of moves in a given time is counted. The
multiple-choice form allows rapid scoring without the need of corrections for
chance in most cases. The language used is made simple except where
this variable is being tested.
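
Where a correction for chance is applied, the conventional rule subtracts from the number right a fraction of the number wrong, the fraction depending on the number of choices. The sketch below shows that conventional rule; it is not taken from any of the batteries discussed, and the figures are hypothetical.

```python
def corrected_score(right, wrong, choices):
    """Conventional correction for chance: rights minus wrongs / (choices - 1).

    Omitted items count neither for nor against the examinee.
    """
    if choices < 2:
        raise ValueError("an item needs at least two choices")
    return right - wrong / (choices - 1)


# Hypothetical examinee: 30 right and 10 wrong on five-choice items.
print(corrected_score(30, 10, 5))  # 27.5
```

With five choices the deduction is only a quarter of a point for each wrong answer, which helps to explain why the correction is often dispensed with.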

Two arrangements of items are found. One, for power tests, places
the items according to measured difficulty, the easier ones first. The
other, for speed tests, as in perceptual speed, assumes that the items
are of nearly equal difficulty and hence places them in random order.
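
The arrangement for power tests can be sketched by taking the proportion of a tryout group passing each item as its measured difficulty and placing the items in descending order of that proportion. The tryout records below are invented, and the use of the proportion passing as the index of difficulty is an assumption of this sketch.

```python
def order_by_difficulty(responses):
    """Return (item order, proportions passing), easiest item first.

    responses holds one list of 0/1 item scores per examinee.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    passing = [
        sum(person[i] for person in responses) / n_people for i in range(n_items)
    ]
    # Sort item indices by descending proportion passing (easiest first).
    order = sorted(range(n_items), key=lambda i: -passing[i])
    return order, passing


# Hypothetical tryout data: five examinees, four items.
tryout = [
    [1, 0, 1, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 1],
]
item_order, proportions = order_by_difficulty(tryout)
print(item_order)   # [0, 2, 3, 1]
print(proportions)  # [1.0, 0.2, 0.8, 0.4]
```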

The content of individual items has been determined by the
consensus of judges. In some instances this seems sufficient; for
instance, in selecting items for simple computation, perception of form,
block counting, and surface development it is not difficult to prepare
very similar items. In other cases, however, the items in one test
probably vary a good deal in content or processes involved. Thus
vocabulary, word fluency, reading, analogies, arithmetic problems, and
the various reasoning tests probably include items which are not pure
measures of any one factor. Item analyses by intercorrelations are much
needed here, and may well result in new types of items. The
preparation of items has been channeled into types which were fairly
common more than 20 years ago, while many newer types have scarcely
been tried out. This sterility has probably come from laziness,
tradition, and a desire to develop something which correlates highly with
some well-established scale, all of which prevent progress.
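
An item analysis by intercorrelations begins with the table of correlations between every pair of items. The following sketch computes such a table from right-wrong scores; the records are invented, and the product-moment coefficient applied to 0/1 scores is chosen here only for simplicity.

```python
import math


def item_intercorrelations(responses):
    """Return the matrix of correlations between items scored 0 or 1.

    responses holds one list of item scores per examinee. Every item must
    be passed by some examinees and failed by others, or its variance is zero.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    columns = [[person[i] for person in responses] for i in range(n_items)]
    means = [sum(col) / n_people for col in columns]

    def r(i, j):
        sxy = sum((a - means[i]) * (b - means[j]) for a, b in zip(columns[i], columns[j]))
        sxx = sum((a - means[i]) ** 2 for a in columns[i])
        syy = sum((b - means[j]) ** 2 for b in columns[j])
        return sxy / math.sqrt(sxx * syy)

    return [[r(i, j) for j in range(n_items)] for i in range(n_items)]


# Hypothetical records: six examinees, three items.
records = [
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
]
table = item_intercorrelations(records)
for row in table:
    print([round(value, 2) for value in row])
```

Items which correlate poorly or negatively with the others in their test are the ones most likely to be measuring something else, and such a table is also the starting point for a factorial analysis of the items.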

The number of items used varies somewhat between authors and
according to the subject matter. Thus, verbal meaning is appraised
by from 50 to 70 items, number computation by 50 to 70,
arithmetical problems 24 to 30 items, spatial thinking about 40 items,
mechanical principles 40 to 60 items, dexterity about 100 items for each
sort, perception 70 to 100 items, learning 20 to 30 items, and
reasoning 27 to 50 items. Guilford-Zimmerman, the Yale Battery, and the
DAT use the largest number of items, and also items which take
longer to finish, so that their working time is from two to four times
as long as that of the Chicago Primary Abilities or the General
Aptitude Test Battery. The Yale Battery is also long, and Crawford and
Burnham have emphasized that the speed requirement probably
introduces a factor in the test situation which should be isolated and
for some purposes avoided. In clinical practice where the patient is
distractible, these tests have little value, but similar tests could and
doubtless will be prepared for clinical situations.

STUDY GUIDE QUESTIONS 

1. What were the principal early developments in group mental tests?

2. What criteria were used in selecting tests for the 1917 Army Alpha
Test?

3. To what extent was speed a factor in the Army Alpha and Beta Tests?

4. To what extent do 2-factor and general tests differ in content?

5. What characteristics must a culture-free test have?

6. What is the design and purpose of the American Council on
Education Psychological Examination?

7. How does the California Test of Mental Maturity arrive at scores
for reasoning, language, and number?

8. How can deterioration in intellectual efficiency be measured?

9. Why has factorial purity of a test become a major consideration for
some authors?

10. How is factorial purity demonstrated?

11. How are primary abilities defined? How are they measured?

12. To what extent are speed and power important in the Thurstone and
Guilford batteries?

13. What is the significance of the general intelligence score of the
General Aptitude Test Battery?

14. How are Occupational Aptitude Patterns established from the
General Aptitude Test Battery?

15. What is the magnitude of age differences on most of these scales?

16. What magnitude of sex differences is usually found?

17. What evidence is there of harmful results of careless use of tests?

18. What predictions have been found for one semester? four semesters?

19. How significantly do fields of specialization show expected profiles?

20. Why did Stuit and Hudson (1942) find such low correlations between
grade averages and spatial ability among engineers?

21. How did Dvorak match individual and group profiles?

22. What are the main differences in content shown in Illus. 91?

23. What are the principal components of the language factor?

24. What is the nature of each of the eight major groups of factors?

25. What varieties of learning are tested?

26. What are the most effective measures of reasoning?

27. What types of factors are not yet well explored?

28. How may the factorial purity of individual items be discovered?

29. How can the optimum number of items needed be discovered?



CHAPTER IX 


MECHANICAL AND 
MOTOR TESTS 




In this chapter certain abilities involved in motor and mechanical 
skills are defined, and their evaluation according to standard tests is 
discussed. Paper-and-pencil tests of knowledge and reasoning about
tools, objects, and forces are described. Then hand-and-eye coordina- 
tion, strength, reaction time, and problem solving are considered. 
Lastly, attempts to describe and analyze basic mechanical factors are 
reviewed. 


PAPER-AND-PENCIL TESTS 


Knowledge 

Tests of knowledge are widely used in appraising achievement in 
apprentice and shop training, and in predicting success in future 
training or employment. Testing knowledge alone or knowledge
coupled with mechanical reasoning has been found to be about the 
best single way to predict success both in training and on the job. 
When such tests are developed for one trade exclusively, they are 
called trade tests. Tests of the printed and oral types are described 
below. 

Printed Tests of Mechanical Information. This type of test is
illustrated by such multiple-choice items concerning woodworking
as are given in Illus. 92. The items are taken from an elaborate study
of mechanical ability by Paterson et al. (1930), which included
appraisals of knowledge in each of the following: woodworking,
printing, mechanical drawing, sheet-metal working, and electricity. A
large number and variety of examinations of this kind are prepared
by teachers of mechanical subjects in high school and college and by
civil service examiners. The Purdue Tests of Machine Shop Practice
and of Electricity are good examples of widely standardized tests.

ILLUS. 92. MANUAL TRAINING INFORMATION TESTS
Minneapolis Public Schools

Test A. Woodwork

Devised by Manual Training Department Committee on Objectifying Grades
on Woodwork, assisted by the Mechanical Abilities Research Staff, University of
Minnesota

Name . . . . . . . . School . . . . . . . .

Date of birth (day, month, year) . . . . . . . . Grade . . . . . . . .

Underscore shop courses you have taken: Sheet Metal, Mechanical Drawing,
Electricity, Woodwork, Printing

This is a test to see how much you know about Woodwork.

Here is a sample question already worked out. Notice how it is done.

St. Paul is the capital of 1. Ohio 2. Vermont 3. Minnesota . . . ( 3 )

The right answer is Minnesota, so Minnesota is underlined. Notice also that
the word Minnesota is No. 3, so 3 is written in the parenthesis at the right-hand
side of the page. You are not expected to answer all the questions below. Answer
just as many of them as you can.

Underline only one word in each case and be sure to put the number of that word
in the parenthesis at the right-hand side of the page.

Stop. Wait for signal before beginning.

Test begins here

1. For holding pieces of wood together while gluing, use
   1. clamps. 2. screws. 3. boards. 4. nails. . . . ( )

2. Holes are bored with
   1. a brace and bit. 2. an awl. 3. a planer. 4. a chisel. . . . ( )

3. A jack plane is made of
   1. steel. 2. copper. 3. celluloid. 4. glucose. . . . ( )

4. A part of a brace is the
   1. chuck. 2. point. 3. break. 4. knife. . . . ( )

5. The marking gauge is used in
   1. laying out widths. 2. cutting small boards. 3. making
   joints tight. 4. boring holes. . . . ( )

6. When sanding flat surfaces, the sandpaper should be backed with a
   1. flat block. 2. wooden plane. 3. cylinder. 4. piece of
   pumice stone. . . . ( )

(and 132 more items)

(From Paterson et al., 1930, p. 150. By permission of the
University of Minnesota Press.)

Pictorial tests of mechanical information resemble those shown in
Illustrations 93 and 14. In Illus. 93 the subject is to choose the one of
five phrases that gives the name and use of a small pictured object.
In Illus. 14 the parts of a lathe are to be labeled and their functions
explained.

Oral Trade Tests. In 1917 many oral tests of trade knowledge
were prepared for the United States Army for a rapid, rough screen- 
ing of a large number of soldiers. Each test included from fifteen to 
twenty-five specific questions of definition or procedure which had 
been found to distinguish between apprentices, journeymen, and ex- 
perts in a particular trade. In 1940 the United States Employment 
Service developed many more oral trade questions, and brought them 
up to date for almost two hundred craft occupations. These tests 
have not been made available to private employers, in the hope of 
preventing their use by coaching schools. Similar written and oral 
tests, however, have been prepared in industry, where a large num- 
ber of workers have to be screened rapidly. In general, the written 
tests are more extensive, more intensive, and more valid. Many em- 
ployment interviewers have adapted items from both oral and writ- 
ten trade tests for use in particular types of interviews. 

Mechanical Principles 

Thurstone (1938) defined Mechanical Reasoning as one of the
primary mental abilities, and measured it by printed pictorial tests,
such as that shown in Illus. 147. Bennett's Test of Mechanical
Comprehension (1940, 1947) includes problems of heat and light, as well
as of forces of various sorts. More recently, this type of test has been
greatly expanded and widely applied by the military services
(Chapter XI) and has been incorporated into most of the batteries of
basic abilities (Chapter VIII).

Spatial Visualization 

Tests of ability to observe, compare, and visualize geometric shapes
or patterns appeared in the United States Army Beta Test of Cube
Counting in 1917, and have since been expanded. Illustration 94
shows a widely used paper form board for which additional norms
were issued in 1948 by the Psychological Corporation. Thurstone and
others have found Spatial Visualization to be an aptitude essential for
success in all drafting work and in engineering design. The test can be
made very difficult by including pictures of 3-dimensional objects
and rotating them in three directions as in the Guilford-Zimmerman
Test (Illus. 86).

ILLUS. 93. SRA MECHANICAL APTITUDES, FORM AH
MECHANICAL KNOWLEDGE

PRACTICE EXERCISES

How much do you know about tools, machines, and other equipment used by
carpenters, plumbers, electricians, gardeners, machinists, auto mechanics,
housewives, and others who work with mechanical devices? Mechanical knowledge is
important for success in mechanical activities. This first test measures your
information about mechanical devices.

Look at the problem below.

P1 [picture of a hand axe]
This is used to
   A. chop wood
   B. scrape paint
   C. remove nails
   D. shape metal
   E. break rocks

The picture shows a hand axe, which is used to chop wood. An X has been
marked in the box after chop wood.

Now work the problems below. In each problem, put an X in the box after the
right answer. Mark your answers heavily. Do not make any marks except your
answers.

If you wish to change an answer, draw a circle around the box. Then
mark the new answer in the usual way. DO NOT ERASE ANY MARK YOU
HAVE MADE ON THE ANSWER PAD.

P2 [picture of a tool]
This is used in
   A. cranking gasoline engines
   B. bonding wood strips
   C. opening cans
   D. removing spark plugs
   E. boring holes in wood

P3 [picture of a bolt]
This is
   A. a machine bolt
   B. a carriage bolt
   C. a window bolt
   D. a stove bolt
   E. an eye bolt

You should have marked boring holes in wood and a machine bolt on
the Answer Pad.

Be sure you understand how to work these problems. When the examiner gives
the signal, you are to work more problems like those above.

Work quickly, but try not to make mistakes. You will have 10 minutes, but are
not expected to finish in this time. There are FIVE pages of problems.

(By permission of Richardson, Bellows, Henry, and Co., Inc., and The Science
Research Associates.)







ILLUS. 94. THE REVISED MINNESOTA PAPER FORM BOARD

READ THE FOLLOWING DIRECTIONS VERY CAREFULLY
WHILE THE EXAMINER READS THEM ALOUD

Look at the problems on the right side of this page. You will notice
that there are eight of them, numbered from 1 to 8. Notice that the
problems go DOWN the page.

First look at Problem 1. There are two parts in the upper left-hand
corner. Now look at the five figures labelled A, B, C, D, E. You are to
decide which figure shows how these parts can fit together. Let us
first look at Figure A. You will notice that Figure A does not look like
the parts in the upper left-hand corner would look when fitted
together. Neither do Figures B, C, or D. Figure E does look like the
parts in the upper left-hand corner would look when fitted together,
so E is PRINTED in the square above [1] at the top of the page.

Now look at Problem 2. Decide which figure is the correct answer. As
you will notice, Figure A is the correct answer, so A is printed in the
square above [2] at the top of the page.

The answer to Problem 3 is B, so B is printed in the square above [3]
at the top of the page.

In Problem 4, D is the correct answer, so D is printed in the square
above [4] at the top of the page.

Now do Problems 5, 6, 7, and 8.

PRINT the letter of the correct answer in the square above the number
of the example at the top of the page.

DO THESE PROBLEMS NOW

If your answers are not the same as those which the examiner reads to
you, RAISE YOUR HAND.

DO NOT OPEN THE BOOKLET UNTIL YOU ARE TOLD TO DO SO

Some of the problems on the inside of this booklet are more difficult
than those which you have already done, but the idea is exactly the
same. In each problem you are to decide which figure shows the parts
correctly fitted together. Sometimes the parts have to be turned
around, and sometimes they have to be turned over in order to make
them fit. In the square above [1] write the correct answer to Problem
1, in the square above [2] write the correct answer to Problem 2, and
so on with the rest of the test. Start with Problem 1, and go DOWN the
page. After you have finished one column, go right on with the next.
Be careful not to go so fast that you make mistakes. Do not spend too
much time on any one problem.

PRINT WITH CAPITAL LETTERS ONLY. MAKE THEM SO THAT
ANYONE CAN READ THEM.

DO NOT OPEN THE BOOKLET BEFORE YOU ARE TOLD TO DO SO

YOU WILL HAVE EXACTLY 20 MINUTES TO DO THE WHOLE TEST

(Courtesy of Likert and Quasha, 1934)


Computation 

Nearly all mechanical work includes accurate measurement and
some computation. Lawshe and Mountoux (1942) issued an
Industrial Training Classification Test which revised an earlier form. This
includes reading dimensions from mechanical drawings of an
irregular block with four holes, of a lot with several buildings, of a
bolt with two nuts, and of a fountain pen. Addition, subtraction,
multiplication, and division of whole numbers, fractions, and
decimals are required. More recently such tests have appeared in all
batteries for mechanical prediction.

Batteries 

Batteries of tests of mechanical ability are now available at college 
and high school levels. Moore, Lapp, and Griffin (1943) developed 
an Engineering and Physical Science Test with six parts:

I. Mathematics: 25 problems involving algebra and square roots
II. Formulation: 10 problems in which a verbal statement is to be cast
into an algebraic formula
III. Physical Science Comprehension: 45 items
IV. Arithmetic Reasoning: 10 items where algebra is very helpful
V. Verbal Comprehension: 43 technical and literary items
VI. Mechanical Comprehension: 22 items of mechanical reasoning

This test has had a wide application to candidates for admission to
engineering schools.

At the high school level and for average adults the SRA
Mechanical Aptitudes Test, prepared by Richardson, Bellows, Henry, and
Company, Inc. (1947), contains three parts. Part I consists of
forty-five pictures of commonly used tools, followed by five written or
printed choices (Illus. 93). Sometimes the object is simply to be
named, but more often its use is to be stated. The questions cover a
wide variety of tools for metal, wood, and drafting operations. Part
II is a Spatial Relations test which consists of four simple key figures
which remain the same and are printed on each page, and forty items,
each one of which represents one of the key figures cut into two or
three separate pieces. One must indicate which of the four key figures
would be reconstructed if the smaller sections were properly fitted
together. In some instances pieces must be rotated. This part is similar
to the paper form board (Illus. 94). Part III is a Shop Arithmetic
test in which most of the items are based upon drawings or tables.
These require the use of language as well as computation and
reasoning. Each problem has four choices and an alternative, “None of
these.”

The first two tests are allowed 10 minutes each and the last one 15
minutes. Separate scores are given for each part of the test because
the authors feel that maximum validity or prediction of results will
be found if the user develops the best weights for his particular situation.
Norms for total scores are also given. The correlation between
the three parts is in the neighborhood of .35, and the Kuder-Richardson
Reliability is approximately .90 for a small sample of high
school graduates attending trade schools. Norms are given for a wide
variety of trainees and technical apprentices as well as for some journeymen.

Wrightstone and O'Toole (1946) issued the Prognostic Test of Mechanical
Abilities for grades seven through twelve and adults. It includes:

1. 15 arithmetic problems featuring fractions and decimals 

2. 15 problems of reading directions from simple mechanical draw- 
ings 

3. 20 multiple-choice items on the identification and use of tools 

4. 15 multiple-choice items of a complex paper form-board type 

5. 15 multiple-choice items of measuring parts of eight drawings 
with a ruler 

Total reliability (about .90) is somewhat higher for older students.
Centile norms for 5,268 boys in seven states are furnished by
grades.


MOTOR COORDINATION TESTS 

The field of motor coordination has received more attention in 
laboratory studies than in the standard testing situations. This is 
probably due to the fact that thorough motor testing employs rather 
extensive mechanical equipment and demands individual adminis- 
tration. A few fairly well-standardized procedures using simple mate- 
rials have been developed, which will be discussed here under five 
headings: reaction time, agility and strength, dexterity, steadiness, 
and motor rhythms. 

Reaction Time 

Reaction times have been extensively studied for both total and
partial responses. Total reaction time is the period between the application
of a stimulus, such as the sound of a bell, and some muscular
response, such as the release of a telegraphic key. Partial reaction 
time is the period between the application of a stimulus and a change 
in some portion of the total reaction path. 

Numerous studies of total reactions of the simplest sort have been
extensively discussed by Woodworth (1938). The evidence is conclusive
that for adult laboratory subjects:

1. Reaction times vary with sense organ stimulated. Approximate mean
reaction times in seconds, as given by moving a finger as quickly as possible
after a moderate stimulus, were for

touch on the hand        .120      pain            1.000+
touch on the forehead    .130      smell           1.000+
sound                    .130      salt taste       .307
light                    .180      sweet taste      .496
cold                     .150      acid taste       .536
warmth                   .180      bitter taste    1.082

2. Reaction times vary with the intensity of the stimulus; in general, the
more intense the stimulus, the quicker the response.

3. Reaction times vary with the nervous connections. Thus, the response
of the right hand to stimulation of the right hand is faster than to the
stimulation of the left hand or of either foot.

4. The reaction time varies with the reacting movement. A well-practiced
movement is quicker than an unfamiliar movement. The force used to close
a switch is much greater than necessary at first, but later in practice a smaller
well-directed movement is used.

5. Ready signals are very important. Regular signals about one-half
second before the stimulus result in shorter reaction times than longer,
irregular, or shorter signals.

6. Discrimination reactions, such as pressing a key if a sound comes
from the left but not if it comes from the right, are much slower than simple
reactions.

7. Associated reaction times, such as calling out the first word that is
suggested by a stimulus, vary greatly. The more familiar associations are
nearly as fast as simple response time to visual stimuli, but the less familiar
or the emotionally toned responses are much longer.

Van Essen (1935) reported a number of interesting observations in
connection with studies of auto drivers. Manual reaction times to
auditory and visual stimuli varied with the type of stimulation, thus:

1. A short stimulation gave slightly shorter reaction times than the
disappearance of a continuous stimulus, and a much shorter reaction time
than the appearance of a continuous stimulus.

2. The reaction times became shorter when subjects’ reactions affected
the stimulus, as, for instance, when lifting one’s foot from a pedal
extinguished a light, than when the reaction did not affect the stimulus.

3. Reactions to a more distant red light (24 meters) were faster than to
nearer lights (12 meters or 2 meters) when intensity and size of retinal
stimulation were held constant.

4. Repeated reaction times did not fall into a normal curve of distribution
for each individual, but into several patterns which depended upon
qualitative differences in reaction patterns.

Another study, showing growth and practice norms, is that of 
Jones (1937), who instructed his subjects as follows: 

This is a test to see how fast you can move your hand. You place your
hand on this board and hold it down; in a moment a light will appear in the
red bulb. This is a signal that in 1, 2, or 3 seconds a buzzer in the clock will
sound. When you hear that buzzer, you must lift your fingers off just as fast
as you can (demonstrating). At the sound of the buzzer, a little electric clock
starts, the clock stops when you lift your fingers, and by reading the dial we
can tell how quick you are.

After preliminary trials, fifteen trials were taken with each hand. 
Some of the results were: 

1. Odd-even reliability correlations for fifteen trials were approximately
.87 for boys and .89 for girls.

2. No marked differences were found between mean scores for right and
left hands. Correlations between right and left hand responses ranged from
.80 to .86.

3. Retest correlations on groups of ninety children after a year’s interval
were from .60 to .72 and after 3 years, from .55 to .57.

4. Small practice effects occurred in eleven-year-old children on the second
day but not thereafter during 4 days of practice.

5. Males were slightly superior to females, particularly for the left hand
and at the earlier ages.

6. A marked warming-up effect was noticed during the first five trials
with the right hand and during all fifteen trials of the left hand.

7. Age norms for right-hand reaction times were:

Year     No. of Cases            Mean
 4.5     37                      398.3
 7.3     34                      250.8
10.9     76                      186.7
14.7     50                      162.5
19.5     40 college students     156.7


8. There was some evidence that adaptation was present in groups where 
the test was repeated four times at yearly intervals. These groups surpassed 
their controls. 

9. Motivation was found to be very important. The subjects were allowed
to see their own reaction times on the face of a chronoscope and were urged
to break their own records.
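
The odd-even reliability mentioned in the first result above can be illustrated with a minimal Python sketch. It assumes each subject's reaction times are stored as a list in trial order; the function names and the sample figures are hypothetical and are not taken from Jones (1937). Each subject's odd-numbered and even-numbered trials are averaged, and the two sets of averages are then correlated across subjects.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Undefined (division by zero) if either list has no variation.
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(trials_by_subject):
    """Correlate the mean of odd trials with the mean of even trials."""
    odd_means, even_means = [], []
    for trials in trials_by_subject:
        odd = trials[0::2]   # trials 1, 3, 5, ...
        even = trials[1::2]  # trials 2, 4, 6, ...
        odd_means.append(sum(odd) / len(odd))
        even_means.append(sum(even) / len(even))
    return pearson_r(odd_means, even_means)


if __name__ == "__main__":
    # Hypothetical reaction times for four subjects, six trials each.
    sample = [
        [210, 205, 198, 202, 195, 199],
        [260, 251, 255, 248, 250, 246],
        [180, 186, 179, 181, 177, 183],
        [300, 290, 295, 288, 291, 285],
    ]
    print(round(odd_even_reliability(sample), 3))
```

Means rather than sums are used for the two halves because fifteen trials do not divide evenly into odd and even sets.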

Studies of total reaction times have convinced investigators that a 
response is a complicated affair which should be analyzed into its 
constituent parts. An early analysis was made by Exner (1868), who 
described the following seven segments of a reaction: 

1. Time needed for excitation of sense organ 

2. Conduction through a sensory nerve 

3. Conduction from lower spinal cord to lower brain centers 

4. Conduction from sensory to motor brain centers 

5. Conduction from motor center to lower cord

6. Conduction through motor nerve

7. Muscular movement 



MECHANICAL AND MOTOR TESTS 273 

These or similar segments have been studied by means of small
electrical conductors placed in or near various parts of the nervous
pathway. One variety of this technique measures the time between a
particular stimulus and the changes in the resistance of the skin on
some part of the body. This reaction is often called a psychogalvanic
reflex. An excellent review of research dealing with the galvanic skin
responses is given by Woodworth (1938). Another variation of the
technique records small changes in electrical resistance which occur
in brain tissue or in the scalp. Photographic records of such changes
are called electroencephalographs. This fascinating field, which is
being rapidly developed, is reviewed periodically in Child Development
Abstracts.

Agility and Strength

The tests for infants described in Chapter V were found to contain
large sections devoted to appraisals of posture, hand and eye coordination,
and locomotion. Tests of similar patterns of behavior
have been developed for older children and adults, and the norms
in some cases are fairly adequate.

A series of motor tests for children from two to six years of age
were tentatively standardized on ninety-eight children by McCaskill
and Wellman (1938), who describe the tests as follows (p IT I):

It was the idea from the beginning to keep each activity in the test
situation as simple as possible so that it could be reproduced easily. For the
ball throwing and ball bouncing a "location field" was devised for determining
the distance and direction of the child's throw or bounce. This field
was made of heavy brown paper 8 feet wide and 17 feet long with a 4 inch
strip of wall board at each end to anchor it. The width was marked off into
zones. Zone 1 was 2 feet in width and extended down the center of the
field, with zones 2, 3 and 4 each 1 foot wide to the right and left of zone 1.
The length was marked off into distances: distance 1 was 3 feet in length
and distances 2, 3, 4, 5, 6, 7, and 8 each 2 feet in length. For the throw or
bounce the child stood at the edge of and on the center line of the field and
threw or bounced the ball to the experimenter at the opposite end of the
field. The results were recorded according to zone and distance and the
child's use of one or both hands.

In the catching series the ball was thrown to the child at a level with his
chest each time, as nearly as was possible. The method used in the attempt
to catch the ball, the successes, and whether or not the child used defense
movements when the ball was tossed to him were all recorded. Two balls
were used throughout, one IG'Vj inches in circumference, the other OY 2
inches. Three trials were given for each performance with each ball.

To determine the child’s ability to maintain equilibrium and balance,
a walking path and a circle were used. A demonstration was given in each
case. If the child failed to respond, he was asked to follow the experimenter
as she walked on the path or circle. The path, 10 feet long and 1 inch wide,
was drawn on a large piece of brown paper and colored red. The circle
was cut from wall board and was 4 feet in diameter. A strip (1 inch wide)
was drawn around its outer border and colored red like the path. The
number of times the child stepped off the path and circle on each trial was
recorded. Three trials were given on each. If the child set one foot off the
line for balance but did not take a step, he was not penalized.

Four heights were used for the jumping: boxes 8, 12, 18, and 28 inches
in height. The child was given three trials at each height and was checked
according to the method he employed in the jump.

The child was asked to hop on one and on both feet and a demonstration
of each was given him. He was given three trials at each. The stages were
recorded in number of steps. Four items were listed under skipping: walking,
shuffle, skipping on one foot, and skipping on alternate feet. The
experimenter skipped and asked the child to follow her. The shuffle has a
rhythmic quality; the same foot is always forward at each advance. It is
definitely in advance of walking and is not the same high step the children
employ in galloping.

Both ascending and descending steps were tested on short and long flights.
The short flight had 4 steps 7 inches in height with an 11 inch tread. The
long flight had 11 steps of the same height and tread. The rail on both
flights was 29 inches in height. The steps on which the kindergarten children
were tested were slightly different. The short flight had 4 steps 7 inches
in height with a tread of 10 inches and the long flight 12 steps 614 inches
in height with a tread of 1 1% inches. The hand rail was 34 inches above the
steps. The child had three trials on ascending and descending both flights
of steps.

For ladder climbing two ladders were used. One had twelve rungs 6 inches
apart, the other six rungs 12 inches apart. The ladders were placed at
approximately a 45 degree angle each time. This placement was kept constant
by placing the same ladder rung against the support each time.
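
The layout of the "location field" in the quoted description can be expressed as a short Python sketch. The function name, the coordinate convention (lateral offset from the center line and length from the child's edge, both in feet), and the handling of points that fall exactly on a boundary are assumptions made for illustration; McCaskill and Wellman do not state how boundary cases were recorded.

```python
# Upper edges, in feet, of zones 1-4 measured sideways from the center line.
# Zone 1 is 2 feet wide (1 foot to each side); zones 2-4 are 1 foot each.
ZONE_EDGES = (1.0, 2.0, 3.0, 4.0)

# Upper edges, in feet, of distances 1-8 measured from the child's edge.
# Distance 1 is 3 feet long; distances 2-8 are 2 feet each (17 feet total).
DISTANCE_EDGES = (3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 17.0)


def classify_landing(offset_ft, length_ft):
    """Return (zone, distance) for a landing point, or None if off the field.

    offset_ft: signed sideways offset from the center line (left or right).
    length_ft: how far down the field the ball landed.
    A point exactly on a boundary is assigned to the nearer (lower) band;
    this tie rule is an assumption.
    """
    side = abs(offset_ft)
    if side > ZONE_EDGES[-1] or not 0.0 <= length_ft <= DISTANCE_EDGES[-1]:
        return None
    zone = next(z for z, edge in enumerate(ZONE_EDGES, start=1) if side <= edge)
    distance = next(
        d for d, edge in enumerate(DISTANCE_EDGES, start=1) if length_ft <= edge
    )
    return zone, distance


if __name__ == "__main__":
    print(classify_landing(0.5, 2.0))
    print(classify_landing(-2.5, 10.0))
```

The two edge tables add up to the stated field size: 8 feet across and 17 feet long.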

In scaling these tests, months of motor age were assigned to each
observed stage of development, or item, from the results of testing
the standardization group. Age equivalents were calculated using
Thurstone's (1925) method, which shows the age at which 50 per
cent of the group passed the item, and 50 per cent failed to pass.
Illustration 95 shows the motor ages for each item, which, incidentally,
correspond closely to values found by Bayley (1935) for hopping,
walking, and stair climbing among two- and three-year-olds.
Illustration 95 also shows a point scale used for individual scores.
Separate scores were provided for four kinds of activities by summing
the points made on groups of items called (1) Steps and Ladders, (2)
Ball Activities, (3) Jumping, and (4) Hopping, Skipping, and Walking.
Norms for total scores were also published. The mean correlations
between these types of motor skills for all ages combined were

ILLUS. 95. MOTOR ACHIEVEMENTS SCALE

Months  Score  Item

5^1     4      bouncing large ball, one hand, distance 1
68      6      catching large ball, elbows at side of body, success on 2 or 3 trials
65      3      bouncing large ball, both hands, distance 3
65      7      throwing small ball, both hands or one hand, distance 7
63      5      throwing large ball, both hands or one hand, distance 5
62      4      descending large ladder, alternate feet, with facility
60      4      hopping on one foot, 10 or more steps
60      3      skipping, alternate feet
57      6      throwing small ball, both hands or one hand, distance 6
56      3      descending large ladder, alternate feet, with caution
55      5      catching small ball, elbows at side of body, no success on one trial
55      4      descending long steps, alternate feet, unsupported
55      3      hopping one foot, 7 to 9 steps
53      4      throwing large ball, both hands or one hand, distance 4
53      4      descending small ladder, alternate feet, with facility
52      5      throwing small ball, both hands or one hand, distance 5
51      3      descending small ladder, alternate feet, with caution
51      5      catching large ball, elbows at side of body, no success or success on one trial
50      4      catching small ball, elbows in front of body, success on 2 or 3 trials
49      4      descending short steps, alternate feet, unsupported
48      3      descending long steps, alternate feet, with support
48      3      descending short steps, alternate feet, with support
47      4      ascending large ladder, alternate feet, with facility
46      2      bouncing large ball, both hands, distance 2
46      3      jumping 28 inches, alone, feet together
46      2      hopping one foot, 4 to 6 steps
45      3      ascending large ladder, alternate feet, with caution
45      3      walking circle, no steps off
44      4      throwing small ball, one hand or both, distance 4
44      4      catching large ball, elbows in front of body, success on 2 or 3 trials
43      1      hopping one foot, 1 to 3 steps
43      2      jumping 28 inches, alone, one foot ahead
43      3      throwing large ball, both hands or one hand, distance 3
43      2      skipping on one foot
42      4      hopping both feet, 10 or more steps
41      4      ascending long steps, alternate feet, unsupported
41      3      hopping two feet, 7 to 9 steps
40      2      hopping two feet, 4 to 6 steps
40      5      bouncing small ball, one hand, distance 2
38      2      descending large ladder, mark time, with facility
38      4      ascending small ladder, alternate feet, with facility
38      3      catching small ball, elbows in front of body, no success or success on one trial
38      1      skipping and shuffle
38      1      hopping both feet, 1 to 3 steps
37      2      catching small ball, arms straight, success on 2 or 3 trials
37      3      jumping 18 inches, feet together, alone
37      3      walking path, no steps off
36      1      jumping 28 inches, with help
35      3      catching large ball, elbows in front of body, no success or success on one trial
35      2      walking circle, 1 to 3 steps off
34      2      catching large ball, arms straight, success on 2 or 3 trials
34      2      descending long steps, mark time, unsupported
34      3      jumping 12 inches, alone, feet together
34      3      ascending small ladder, alternate feet, with caution
33      2      ascending large ladder, mark time, with facility
33      3      throwing small ball, both hands or one hand, distance 3
33      3      jumping 8 inches, alone, feet together
31      3      ascending long steps, alternate feet, with support
31      4      ascending short steps, alternate feet, unsupported
31      2      jumping 18 inches, alone, one foot ahead
31      2      walking path, 1 to 3 steps off
30      2      throwing large ball, one hand or both hands, distance 2
29      2      throwing small ball, one hand or both hands, distance 2
29      3      ascending short steps, alternate feet, with support
29      2      ascending long steps, mark time, unsupported
28      2      ascending short steps, mark time, unsupported
28      1      walking circle, 4 to 6 steps off
28      1      walking path, 4 to 6 steps off
24      4      bouncing small ball, one hand, distance 1
24      2      ascending short steps, mark time, unsupported
24      1      jumping 18 inches, with help
24      2      jumping 12 inches, alone, one foot ahead
24      1      descending large ladder, mark time, with caution

(McCaskill and Wellman, 1938, p. 148. By permission of Society for Research
in Child Development.)

approximately .69 for boys and .75 for girls. The true correlations
between these skills would doubtless be much lower if age were held
constant. The lowest correlations were between Ball Activities and
Steps and Ladders, and the highest between Steps and Ladders and
Hopping.
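
The idea behind the motor ages, namely the age at which half the group passes an item, can be illustrated with a rough Python sketch. It uses straight-line interpolation between adjacent age groups, which is a simplification chosen for illustration and is not Thurstone's scaling procedure itself; the function name and the sample proportions are hypothetical.

```python
def age_at_half_passing(ages_months, proportion_passing, target=0.5):
    """Estimate the age at which `target` of the group passes an item.

    ages_months: mean age of each age group, in months.
    proportion_passing: fraction of each group passing the item (0 to 1).
    Returns the interpolated age, or None if the proportions never rise
    through the target within the ages tested.
    """
    pairs = sorted(zip(ages_months, proportion_passing))
    for (age_lo, p_lo), (age_hi, p_hi) in zip(pairs, pairs[1:]):
        if p_lo <= target <= p_hi and p_hi > p_lo:
            fraction = (target - p_lo) / (p_hi - p_lo)
            return age_lo + fraction * (age_hi - age_lo)
    return None


if __name__ == "__main__":
    # Hypothetical pass rates for one item at four age levels.
    ages = [30, 36, 42, 48]
    passing = [0.10, 0.20, 0.40, 0.80]
    print(age_at_half_passing(ages, passing))
```

An item that every age group passes, or that none passes, yields no estimate under this sketch, which mirrors the point made below that items too easy for six-year-olds cannot discriminate among them.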

Forty-six youngsters were retested within one week. The total-score
retest reliability was .98. Among the subtests the lowest reliabilities
were in Ball Activities and Walking Around a Circle. The authors
believe that the lower reliabilities of the Ball Activities scores
were due to the fact that scoring in this instance was more minute
and detailed than in some of the other tests. The boys appeared to be
slightly superior to the girls on Steps and Ladders and Ball Activities,
and the girls were a little ahead in Hopping and Skipping. The
scores discriminated well between two- and five-year-old groups but
not between five- and six-year-old groups, because the items are not
difficult enough to discriminate among the six-year-olds who were
above average.

An excellent summary of tests of agility and strength is given by 
Bovard and Cozens (1938). They include norms for anthropometric, 
cardiac, strength, athletic information, and physical efficiency meas- 
ures among elementary school, high school, and college groups. Jones 
and Seashore (1944) reported a careful developmental study of fine 
motor and mechanical abilities during adolescence. 

Dexterity 

Tests which are classified as measures of dexterity characteristically 
appraise routine or serial perceptions and movements in terms of 
speed, accuracy, endurance, and force. In certain dexterity tests, such 
as simple tapping, perceptual discrimination is probably a minor 
element, and the score is largely determined by muscle and nerve 
functions. In more complex tests, such as serial aiming or pursuit 
tests, the perceptual elements may be more important in success than 
motor coordination. Moreover, there seem to be at least two nearly
unrelated types of motor coordination: (1) ballistic movement, in
which an extremity is thrown or moved rapidly in some highly automatized
pattern, illustrated by simple tapping with a stylus, and
(2) precision movement, in which voluntary control is continuous,
as in serial aiming or tweezer dexterity.

Among the simpler ballistic tests is the tapping test described by
Whipple (1914) and used in many studies since that time. The apparatus
for the tapping test consists of a metal plate 2 inches square,
a metal stylus, and an electric counting device. The number of taps
recorded in five 6-second trials, allowing 30 seconds of rest between
trials, was used as a score by Paterson et al. (1930). Others have used
various time limits of from 5 to 60 seconds. Maximum retest correlations
of about .91 were found by Muscio (1922), Gates (1928), and
Greene (1931) under optimum conditions. Greene reported that 10
seconds was the optimum time limit for single trials of college students.
Shorter periods seemed to introduce random variations, and
longer periods were influenced by individual differences in fatigue.
He also found that speed of tapping for 10 seconds with paper and
pencil, a group test, correlated .77 with a 10-second trial on the
Whipple apparatus.

Another tapping test which seems to measure about the same sort
of coordination is pressing and releasing a telegraphic key as rapidly
as possible. A fairly large number of researches have used tapping
keys and automatic counters.

Tapping tests among adults have been reported to show nearly 
zero correlations with other motor tests and with mental test scores. 

One of the earlier batteries of tests which emphasized precision of
movement was that of Whitman (1925), whose series included the
following activities:

1. Putting one brass pin in each hole drilled in a large board (each
hand, 1 min.)

2. Putting three brass pins in each hole (using both hands, 2 mins.)

3. Assembling nuts and bolts (30 secs.)

4. Disassembling nuts and bolts (30 secs.)

5. Sorting different colored pegs (30 secs.)

6. Placing pegs of a particular color in order on a peg board (1 min.)

Whitman furnished age norms for groups from seven to fifteen
years of age. O'Connor (1928) used a test similar to No. 2 and also
described a test called Tweezer Dexterity, which requires a person
to place metal pins, one at a time, in one hundred holes drilled in a
metal plate using a small pair of tweezers. These tests have good
reliability and have predicted success fairly well in jobs which require
delicate assembly of small apparatus.

There are now available about twenty dexterity tests which have
been partly standardized (Appendix II). One of the best is the Purdue
Dexterity Test (Illus. 96), which has two parts. In one, short
metal rods are to be placed in rows of holes in a board, with each
hand separately and with both hands together. In the other part,
skilled finger dexterity is required when both hands are used to
assemble pins and washers and place them in holes. The two tests
can be given to ten workers at once and require only 214 minutes of
testing time, following directions and practice periods.

ILLUS. 96. PURDUE DEXTERITY TEST



(By permission of Dr. Joseph Tiffin.) 


Another dexterity test which involves timing is the Purdue Hand
Precision Test (Illus. 97). Here three half-inch holes are uncovered
by the rotation of a disc at the rate of 126 holes per minute. The holes
are located at the corners of a triangle 3.5 inches to a side. One is
asked to put a stylus into each hole as it is uncovered, without touching
the sides or being caught by the shutter. After a 30-second practice
period, a 2-minute test is given, during which a clock records the
error time, that is, the seconds of contact between the stylus and the
sides of the hole or the shutter.

A somewhat similar test is the Bennett Hand-Tool Dexterity Test
(Illus. 98), in which the subject is presented with a U-shaped wooden
frame, on the left-hand upright of which are mounted twelve
bolts (three sizes of four each). The task is to remove the twelve
bolts, nuts, and washers from the left side and to reassemble them in a
prescribed sequence on the right side. The time required is the score.
For a large group of adults the range of scores was from 4 to 12
minutes, average 6% minutes. Test-retest correlations were approximately
.91, and correlations with foremen's ratings on mechanical
work were reported in the neighborhood of .45.

ILLUS. 97. PURDUE HAND-PRECISION TEST

(By permission of Dr. Joseph Tiffin.)

ILLUS. 98. BENNETT HAND-TOOL DEXTERITY TEST

(By permission of The Psychological Corporation.)

Steadiness

In this group are found measures of both large- and small-muscle
groups. Paterson (1930) has described a test of steadiness of
large-muscle groups, called the Body-Balancing Test. The test is
used to discover how long a person can balance himself on a 3'-m^
cube of wood on the floor, using only the ball of one foot for support.
A simple machine for recording postural steadiness, called an ataxiameter,
has been described by Miles (1922). It consists of a square,
balanced board upon which a person stands erect on both feet and
with eyes closed. Each corner of the board is connected with an instrument
which records its vertical movements. Great steadiness is
indicated by little total movement. A similar device, called a wabblemeter,
has been designed by Moss (1931).

A steadiness test of hand-and-eye coordination has been described
by Whipple (1914). A brass plate with nine holes of various diameters,
a metal stylus, and an automatic counter are used. The subject
is required to hold the stylus in each hole 15 seconds while trying
not to let it touch the edge. The hand and arm are extended and free
from support. The score is the number of contacts that are electrically
recorded. The median retest correlation reported by Paterson
et al. (1930) was .62 for scores on single holes and .76 for total scores
among 217 boys. Whipple also standardized a thrusting or aiming
test with this apparatus.

The existence of a steadiness factor has been suggested by Seashore
and Adams (1933), who applied five tests of steadiness to fifty
students. The tests were Miles' ataxiameter, Beall and Hall's ataxiagraph,
Seashore’s modification of Whipple's steadiness test for both
“position” and “thrusting,” and a rifle steadiness test. The intercorrelations
of these tests ranged from .44 to .59, median .48. These correlations
are probably indicative of a general factor of steadiness in
large-muscle coordinations.

Motor Rhythms 

Ability to perform acts at a particular rate is considered to be 
highly important in many sorts of musical, dancing, and mechanical 
skills. It is difficult, however, to find a precise definition of rhythm, 
and a brief inspection of tests of rhythm will convince one that the 
patterns vary considerably. Subjective analyses usually limit the 
phenomena to temporal patterns and distinguish between perception 
of rhythmic patterns and their performance. Perception of simple 
rhythmic patterns can be tested by asking persons to tell whether 
two temporal patterns are the same, or whether one of two intervals 
is longer than another, as in the Seashore (1919) tests of musical 
talent. (See Chapter X.) 

The performance of rhythmic patterns doubtless requires good
perception, but it also involves to a large extent muscle and central
nervous system elements. An illustration of a standardized test of
simple motor rhythm is that of Seashore (1928), which requires one to
listen to a sequence of four notes that are played over and over again
by a phonograph record for a short time. Then at a signal, one is
required to keep time with the sequence by pressing a telegraph key.
The score, which is automatically recorded electrically, is the number
of taps in one minute which fall within .05 seconds of the exact
time of the sound.
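
The scoring rule just described, counting the taps that fall within .05 seconds of a sounded note, can be sketched in Python as follows. The function name and the sample beat and tap times are hypothetical, and the original apparatus recorded the score electrically rather than from a list of times. As a simplification, every tap close enough to a note is counted, even if two taps fall near the same note.

```python
import bisect


def rhythm_score(tap_times, beat_times, tolerance=0.05):
    """Count taps that fall within `tolerance` seconds of any beat."""
    beats = sorted(beat_times)
    score = 0
    for tap in tap_times:
        # Locate the beats immediately before and after this tap.
        i = bisect.bisect_left(beats, tap)
        neighbors = [beats[j] for j in (i - 1, i) if 0 <= j < len(beats)]
        if neighbors and min(abs(tap - b) for b in neighbors) <= tolerance:
            score += 1
    return score


if __name__ == "__main__":
    # Hypothetical notes one second apart and four taps, in seconds.
    beats = [0.0, 1.0, 2.0, 3.0]
    taps = [0.02, 1.10, 1.96, 3.00]
    print(rhythm_score(taps, beats))
```

Because the beat list is sorted, only the two neighboring beats need to be checked for each tap.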

Van Alstyne and Osborne (1937) adapted this test for small children
and also devised a rhythm memory test. They found boys to be
slightly inferior to girls. Small practice effects were noted. Liebold
(1936) found that four-year-olds could not keep time to a metronome
at the rate of either once or twice per second, but that half of the
five-year-olds and nearly all of the six-year-old group succeeded well. A
complex test of motor rhythms is illustrated by some intricate dance
steps used by Garfiel (1923). She also required persons to attempt
nonsynchronized movements of the two hands. Jersild and Bienstock
(1935) devised an accurate measure of stepping and clapping to
music. The test scores were secured from motion-picture films, by
counting the number of frames, taken at intervals of 1/24 second, in
which the child was in time with the music. Movements of beating
time with the hand and walking were counted as in time if they fell
within Yx second of the actual beat. The relative correlations between
two tests of four hundred beats each were approximately .70,
and of two hundred beats, .60. Among the older children the figures
were somewhat higher. Instructors' ratings were found to be highly
untrustworthy when age was held constant. Correlations between
hand and foot rhythms were near .82, and between hand rhythms
and singing ability, .30.

Rhythms in serial mental work have been studied by Bills (1937).
Mental performance was characteristically found to be discontinuous.
He reported pauses which became longer and more frequent as
the task became more difficult and as fatigue increased. Elaborate
studies of rhythmic performance on musical instruments have been
reported in the University of Iowa Studies of the Psychology of Music
(Chapter X).

Batteries of Motor Tests

Several batteries of motor tests have been described which seem to
have been assembled, like a general mental test, to sample a fairly
wide variety of skills without attempting a careful analysis. Success
on such batteries seems to depend, in various and unknown proportions,
upon strength, endurance, precision, steadiness, rhythm, perceptual
skill, information, and in some cases upon reasoning and
planning. Most of the batteries seek to furnish separate norms for
their component tests, thereby providing tools for subsequent analyses
and refinements.

Two tests which seem to require principally precision, rhythm, 
ballistic movements, and planning, come from the Minnesota Uni- 
versity Laboratory. The Minnesota Rate of Manipulation Test, 
Ziegler (1934), Illus. 6, shows the speed at which a person can turn
over 120 circular blocks which fit loosely in holes in a board. Pater- 
son et al. (1930) described a Packing Blocks test which measures the 
time needed to place 147 one-inch cubes in a wooden box. Scores for 
one trial of these tests were found to yield low reliabilities, but the
sums of three or four trials usually show retest reliabilities of approxi- 
mately .90. 

Another widely used battery is the Stanford Motor Skills Tests,
described by Robert Seashore (1928). Six serial dexterity tests were
chosen from among more than twenty tests, so that

1. They were well adapted for use in schools and factories.

2. They took only a small amount of time (less than 2 hours).

3. The material occupied little space.

4. They were scored automatically.

5. Each test had high retest reliability (.75 to .86).

6. Each test had a low correlation with the Thorndike College Entrance
Examination.

7. Each test had a low correlation with amounts of training in typing,
practice on a musical instrument, and training in athletics.

8. The intercorrelations were low (mean .25).

The six parts of the Stanford Motor Skills battery test the following: 

1. The Koerth Pursuit Test requires a person to hold the point of a metal
stylus on a metal disc, % inch in diameter, mounted on a phonograph record.
The phonograph is set to revolve once per second, making the metal disc
follow a circular path about 8 inches in diameter. The score is the distance
during which the contact is maintained, during 20 seconds (10 trials).

2. Motor Rhythm (described on p. 281).

3. Tapping Key is a test of the speed of pressing and releasing a tele-
graphic key during 5-second periods (3 trials).

4. Serial Discrimination is a test of the speed of reaction to the four
numbers, 1, 2, 3, and 4, exposed visually in random order. The reaction is
made by pressing the correct one of four keys which correspond to the
numbers. Each key is to be pressed and released by a different finger. The
score is the number of correct reactions in 2 minutes.

5. Brown Spool-Packer is a test of the speed of packing spools in a small
box, using both hands. Score is the number of spools packed in 3 minutes.

6. Miles' Drill Test is a test of the speed of rotating the handle of a small
hand drill for 10 seconds (3 trials).

PUZZLE TYPE TESTS

One group of mechanical tests involves assembly or stripping of 
devices and requires reasoning or random manipulation. Perceptual 
comparisons of shape, size, and relationships are basically important, 
but may not be highly represented in the scores. 

One of the earliest tests of mechanical assembly is the Puzzle Box 
of Healy and Fernald (1911). The glass-framed lid of a small wooden 
box may be opened by releasing, with a buttonhook, strings that have 
been made secure over pegs. The order in which the strings must be 
released can be determined by visual examination of the box, inside 
and out. Freeman (1916) designed a similar puzzle box, in which 
eleven levers had to be moved in a particular sequence in order to 
open the box (Illus. 99). The scores of both of these tests revealed
large differences between sexes, and 5-minute periods were found
to be too short for average adults.

ILLUS. 99. FREEMAN'S PUZZLE BOX

(Courtesy of the C. H. Stoelting Co., Chicago, Ill.)

Another test of this type is O'Connor's (1928) Wiggly-Block Test.
Nine similar pieces of wood, with some edges cut wavy, are to be 
fitted together to form a solid block 9 X ^ X ^2 inches. The score is 
the time needed to finish the task. 

None of these three tests has shown high enough reliability to be
considered a good measuring technique. The low retest reliabilities
probably reflect the fact that some persons take a long time to solve
the puzzle on the first trial, but have a good memory of the solution
on second trial, and that others may enjoy some chance successes
on the first trial but fail to remember the solution later.

The Minnesota Spatial Relations Test (Illus. 100) uses an elaborate
set of four form boards, each of which has about sixty pieces varying
in size or shape. This is an extension of a similar test devised by
Link (1919). Paterson et al. (1930), finding Link's test was too short
to give suitable reliabilities, made it about eight times as long and
achieved a retest reliability of .84 on a sample of 217 boys. The score
is the number of seconds needed to complete the assembly. The task,
at least for adults, is largely one of visual comparison of form and
methodical work.

ILLUS. 100. THE MINNESOTA SPATIAL RELATIONS TEST

Posed by Dr. Bing Chung Ling. (Manufactured by the Educational Test
Bureau, Minneapolis, Minn.)

Another type of mechanical manipulation test requires the assembly
or stripping of common hardware. Stenquist (1923) found norms for the
assembly of the following ten small objects which he purchased at
local stores:

Clothes pin with wire spring
Hunt paper clip
Rubber hose shutoff
Chain with split links
Bicycle bell
Wire bottle stopper
Push button
Small door lock
Cupboard latch
Mouse trap


The objects, taken apart, are presented to the examinee one at a 
time in order of difficulty. A small screw driver is available to the 
examinee. The score is the number of correct assembly operations 
completed in 30 minutes. 

Paterson et al. (1930) believed that the odd-even reliability of the
Stenquist Assembly Test, which was found to be .72 among 217
seventh and eighth grade boys, was not high enough for careful
appraisal. Therefore they modified and enlarged the material
to include thirty-six items in the Minnesota Mechanical Assembly
Tests (Illus. 13). Success in each of these, which was scored on a
10-point scale, was found to correlate .28 or better with total scores.
The highest correlations with totals were found to be in the assembly
of the bicycle bell, .75; the clothes pin, .68; spark plug, .66; and
chain, .64. The odd-even correlation of the total test (using the
Spearman-Brown prediction formula) was .94.
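
The Spearman-Brown prediction formula mentioned above estimates the reliability of a test that is made k times as long: r_k = k r / (1 + (k - 1) r). The following is a minimal sketch of that arithmetic, not a reconstruction of the authors' computations. The function names are illustrative, and the half-test correlation shown is back-computed from the reported .94 as an assumption; the source does not report it.

```python
def spearman_brown(r, k):
    """Predicted reliability when a test of reliability r is made k times as long."""
    return k * r / (1 + (k - 1) * r)


def implied_original(r_long, k):
    """Inverse of the formula: reliability of the shorter test implied by r_long."""
    return r_long / (k - (k - 1) * r_long)


# Odd-even case: two half-tests are correlated, then stepped up to full length (k = 2).
# The value .94 is from the text; the half-test correlation is only implied by it.
half_test_r = implied_original(0.94, 2)
full_test_r = spearman_brown(half_test_r, 2)

print(f"implied half-test correlation: {half_test_r:.3f}")
print(f"stepped-up full-test reliability: {full_test_r:.3f}")
```

The same function applies to any lengthening factor, such as a test made about eight times as long, provided the added material is comparable to the original, which is the usual assumption behind the formula.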

The Purdue Mechanical Assembly Test (Illus. 101) consists of
eight similar boxes, each of which contains parts to be assembled:
levers, gears, racks, pinions, and worms. The nature of the task is
illustrated with one box, then a certain time is allowed for
assembling each of the other seven boxes. No familiar
objects are included, and all principles of mechanical operation are
used. A reliability of .88 is reported, and correlations of from .35
to .55 with supervisor ratings of machinists and machinist apprentices.

ILLUS. 101. PURDUE MECHANICAL ASSEMBLY TEST

(By permission of Dr. Joseph Tiffin and the Purdue Research Foundation.)

CORRELATION ANALYSES 

A number of correlational analyses, which include performance 
and motor tests, have been cited in Chapter VIII. Several typical 
analyses, planned to evaluate factors in mechanical and motor abil- 
ity, are described here. Many persons have questioned whether there 
is a general motor ability which underlies success in all kinds of 
muscular skills. The almost universal conclusion is that, although
certain motor skills show high positive intercorrelations, there is no
general or elemental motor factor. Perrin (1921) and Muscio (1922) 
came to this conclusion after applying balancing, tapping, reaction 
time, strength, and dexterity tests to small groups of adults. 

Garfiel (1923) attempted to find an answer to this question by
intercorrelating the results of sixteen tests and ratings of fifty col-
lege sophomore girls. Using the appraisals listed in Illus. 102, she
found that the correlations with Alpha scores were nearly zero, and
that the median of all the sixty-six intercorrelations of motor tests
was .15. Eleven of these were negative; twenty-five correlations lay
between zero and .20, and only four were above .40. However, a
battery of tests was selected by a multiple-correlation technique
which agreed with the criterion with a correlation of .79. The battery
included the eight tests shown in Illus. 103. The first test, Running
100 Yards, was allotted by this technique a weight equal to approxi-
mately twice the weight of all the other tests together. The fourth
test, Tricks, was allotted one third of the weight of the running
test. In order to raise the prediction of the criteria, scores of two
of the tests, Steadiness and Leg Dynamometer, were subtracted from
the total. Garfiel concluded that there was as much evidence for the
existence of a general motor factor here, as for the existence of a
general mental factor among mental tests. Small and zero correlations
in any battery of reliable tests, however, are probably more indica-
tive of the presence of several unrelated factors.

ILLUS. 102. TESTS USED IN A STUDY OF MOTOR ABILITY

                                                          Retest      Correlation
Test                                                    Reliability  with Criteria

1. Mental. United States Army Alpha Score                   .92          .02
2. Motor Speed
   a. Tapping speed. Hand metal stylus, 60 seconds          .69          .22
   b. Foot speed. Stationary running, 30 seconds            .76          .23
   c. Running 100 yards, indoor track                       .85          .63
3. Motor Co-ordination
   a. Steadiness. Brass plate and stylus, 10/64-inch
      hole, contacts in 60 seconds                          .61          .19
   b. Three-hole test. Insert stylus, 100 times             .60
   c. Target throw. Tennis ball, 12 feet, five throws       .07          .20
   d. Picking up paper with teeth. Stand holding right
      toe in left hand, crossing right foot behind the
      body, then leaning over and grasping a piece of
      writing paper made to stand on the floor by
      folding once the long way. Passed if accomplished
      in 60 seconds with less than three falls                           .44
   e. Tricks. Difficult hand and foot co-ordination                      .29
4. Preferences. A check list of 12 things to do on a
   June afternoon, 6 of which were athletic games                        .31
5. Anatomical
   a. Height                                                             .02
   b. Weight                                                             .23
6. Strength
   a. Hand dynamometer                                      .70          .25
   b. Back dynamometer                                      .81          .40
   c. Leg dynamometer                                       .71          .20
   d. Chest strength                                                     .22
   e. Lung capacity                                                      .28
7. Criterion. Ratings of 6 judges on motor ability as
   shown by strong accurate quick movements. Retest
   reliability of mean ratings, after 16 weeks              .92          .92

(Garfiel, 1923. Arranged from Tables VI and VII. By permission of the
Archives of Psychology.)
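
The multiple-correlation technique described above combines several tests so that their weighted sum agrees as closely as possible with the criterion. For two predictors the multiple correlation has a closed form, sketched below. The criterion correlations .63 (running) and .44 (picking up paper) are read from Illus. 102; the correlation between the two tests is not given in the text, so the value used here is purely hypothetical, and the function name is illustrative.

```python
import math


def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    numerator = r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12
    return math.sqrt(numerator / (1 - r_12 ** 2))


r_running = 0.63   # criterion correlation, Illus. 102
r_paper = 0.44     # criterion correlation, Illus. 102
r_between = 0.23   # HYPOTHETICAL intercorrelation, not reported in the text

combined = multiple_r_two_predictors(r_running, r_paper, r_between)
print(f"multiple correlation with both tests: {combined:.2f}")
```

With a low assumed intercorrelation the second test adds appreciably to the prediction; as the intercorrelation approaches the ratio of the two criterion correlations, the gain shrinks toward nothing. That is why a battery of nearly independent tests can predict a criterion well even when each test alone does not.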

Seashore (1930) selected eight tests of serial motor skills and ap-
plied them to fifty college men. He concluded that the marked
independence of highly reliable tests argued against the existence
of a general motor ability.

A thorough study of mechanical abilities of high school boys was
made by Paterson et al. (1930), who measured more than 150 boys
on seven reliable tests:

1. Minnesota Assembly Boxes A, B, and C

2. Minnesota Spatial Relations Tests A, B, C, and D

3. Minnesota Paper Form Boards A and B

4. Card Sorting

5. Packing Blocks

6. Nine Hole Steadiness

7. Stenquist Mechanical Aptitude Picture Tests

The authors also evaluated interest, home activities, technical in-
formation, and the quality and quantity of articles made in the shop,
and concluded that the low intercorrelations among different meas-
ures of mechanical ability suggest a high degree of specificity. Me-
chanical ability as measured by the tests and shop ratings had no
relation to test scores of intelligence or agility, and little corre-
spondence with ratings of environment. With tests requiring no
mechanical information, they found no mean differences between
academic and mechanical or engineering students on either the
high school or the college level.

ILLUS. 103. PREDICTION OF GENERAL MOTOR ABILITY

                                        Multiple
Test                                   Correlation

1. Running 100 yards                      .63*
2. Picking up paper with teeth            .70
3. Back dynamometer                       .74
4. Tricks. Complex co-ordination          .75
5. Steadiness                             .77
6. Leg dynamometer                        .78
7. Tapping                                .11
8. Hand dynamometer                       .79

* Numbers below this one show increases due to adding each test to
the previous total.

(Garfiel, 1923, Table IX. By permission of Archives of Psychology.)

A factorial analysis of motor abilities among seventy-six high school
boys was made by Buxton (1938), using Thurstone's method. He
selected tests which

1. Included both simple and complex behavior

2. Avoided the influence of fatigue

3. Had high interest value

4. Allowed simple directions

5. Included only serial, repetitive action, where no choices were to be made

The tests used by Buxton were

1. Steadiness (thrusting). Student (S) thrusts a stylus into holes on a
brass plate. The score is the number of thrusts which do not touch the
plate, made from a standard distance at a constant rate. Ten trials were
made at each hole. The holes were smaller as the test progressed.

2. Steadiness (stationary). Student (S) holds the stylus in each hole for
10 seconds. The score is the number of contacts with the plate.

3. Tapping (three discs). Student (S) taps three metal discs in succession
as fast as he can. The discs, 2 inches in diameter, are placed at the corners
of a 6%'inch equilateral triangle.

4. Tapping (two bars). Student (S) taps as fast as possible between two
vertical metal bars, 2 inches apart, with a stylus mounted on a frame strapped
to the forearm to prevent wrist motion.

5. Tapping (wrist turn). Student (S) turns an aluminum handle as fre-
quently as possible through an arc of 135 degrees.

6. Packing (cubes). Student (S) fills a low box with 64 l^ie-inch cubes
as quickly as possible.

7. Packing (spool). Student (S) fills a tray with spools as frequently as
possible in a short time interval.

8. Rotor (mobility). Student (S) turns the handle of a small hand drill as
fast as possible.

9. Rotor (pursuit). Student (S) tries to keep a stylus on a dime-sized disc
which is rotating on a phonograph-like turntable.

The tests were repeated six or eight times, and gains on the four
tests which showed the greatest average gain were included with the
test scores in the statistical analysis. The analysis resulted in six fac-
tors, of which two were tentatively named by the author. One factor
was called steadiness, because it was found to be significant only in
the two steadiness tests and in them it was large. The second factor
was labeled manipulation. It was found to be large in the two pack-
ing tests. It therefore might be described as hand-and-eye coordina-
tion in prehension, short-carry, and placing operations. None of
the other four factors showed large loadings in any tests. One factor
was found in tests which involved the larger muscles of the forearm,
upper arm, and shoulder. Another factor was found only in the
2-bar tapping, where no visual control was needed, and where the
arm movements were ballistic, that is, without precise control after
being initiated. Another factor was fairly large in wrist-turning tests
and another in packing-test gains. The author concludes that no
general motor factor was demonstrated here, and that a larger bat-
tery of tests would be needed in order to show the patterns which
may exist.

A factorial analysis was reported by Morris (1939) of results of
applying the Pintner-Paterson Series, the Sylvester Form Board, the
Lincoln Hollow Square, the Witmer Cylinder Test, the Dearborn
Form Board 3, the Porteus Maze, the Minnesota Paper Form Board,
the Henmon-Nelson Intelligence Test, and the Brown Personality
Test. From thirty-three scores for each of fifty-six boys nine years of
age, he found three common factors. One factor was identified as
Spatial Thinking, a second as Perceptual Discrimination, and a third
as an Ability to Discover or to Use a Rule of Procedure. Few of the
factor loadings were large, and the variances of many of the tests
were not well accounted for by the loadings from these factors.
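
The statement that a test's variance is not well accounted for by its loadings can be made concrete. When the common factors are uncorrelated, the share of a standardized test's variance explained by them (its communality) is the sum of its squared loadings, and the remainder is specific or error variance. The sketch below assumes uncorrelated factors and uses invented loadings for illustration only; none of the figures come from the study, and the names are hypothetical.

```python
def communality(loadings):
    """Share of a standardized test's variance explained by uncorrelated common factors."""
    return sum(a * a for a in loadings)


# HYPOTHETICAL loadings on three common factors (not taken from any study cited here).
hypothetical_loadings = {
    "Test A": [0.60, 0.10, 0.05],
    "Test B": [0.20, 0.25, 0.15],
}

for name, row in hypothetical_loadings.items():
    h2 = communality(row)
    print(f"{name}: communality {h2:.3f}, unexplained variance {1 - h2:.3f}")
```

In this illustration even a loading of .60 leaves well over half of the test's variance unexplained, which is the situation the paragraph describes when few loadings are large.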

A report by Harrell (1939) showed the relationships between
thirty-four tests which were applied to ninety-one cotton mill ma-
chine-fixers in Georgia. The entire battery, which required about
7 hours of work, included three of the Minnesota tests: the spatial
relations boards, the assembly boxes, and the paper form boards.
Among the seven tests from Whitman's Manual Dexterity Series
were the nut and bolt assembly, peg boards, and pin boards. Three
of Crockett's Manual Dexterity tests were used: screwing nuts on
bolts, packing blocks, and laying blocks along a strip. The seven
MacQuarrie tests and four of Thurstone's spatial tests were included
to represent paper-and-pencil techniques. Additional information
included age, school grade completed, experience on mechanical jobs,
foremen's ratings of competence and mechanical ability, self-ratings
of interest, and three of Thurstone's tests of verbal relations.

From an analysis five factors appeared and were tentatively named
Perception of Detail, Verbal Relations, Spatial Visualization, Youth
or Inexperience, and Manual Dexterity. These factors were identi-
fied from the tests which had the highest loadings in one factor and
small loadings in other factors. The first three of these factors are
similar to those described by Thurstone, and reported in Chapter
VIII. The Youth or Inexperience factor resulted from classifications
by age and foremen's ratings.

Some of Harrell's findings about commonly used tests are contrary
to usual beliefs. Thus, the Minnesota Mechanical Assembly Test did
not appear to depend upon either manual dexterity or experience
on mechanical jobs. On the second trial this test lost most of the
Visualization factor and became more dependent upon the Percep-
tion of Detail factor. Furthermore, the Minnesota Assembly Test,
the Minnesota Spatial Relations, and the Wiggly-Block Assembly,
all of which involve the handling of blocks, had practically the same
factor patterns as the Thurstone Spatial Relations Tests and Sten-
quist Mechanical Tests, both of which are limited to paper-and-pencil
situations. These results support the hypothesis that performance
tests which do not require great precision or speed of movement may
usually be interpreted as tests of perception (comparison of details)
or spatial imagery and reasoning. These two factors were again clearly
shown to be independent. One demands rapid comparison of ob- 
jects which are directly perceived, and the other requires that one 
imagine how things would look if they were put together or rotated. 

The AAF made extensive studies of tests of visualization, mechani-
cal comprehension, and motor coordination. The results, reported by
Guilford and Lacey (1946), show that there appears only one visualiza-
tion factor which accounts for 2-dimensional, 3-dimensional, and
moving-parts visualization. If the forms are familiar, a visual memory
factor is important. From a factorial analysis of seventeen tests
studied, the following seven factors appear: (1) mechanical experi-
ence, shown by information, (2) perceptual speed, (3) verbal reading,
(4) length estimation, (5) visualization, (6) spatial relations, as shown
in complex motor-coordination tests and paper-and-pencil mechani-
cal-movement tests, and (7) general or mathematical reasoning.

Jones and Seashore (1944) have summarized their own and others’ 
work in measuring the growth and interrelation of motor and me- 
chanical abilities. They point out that there is no evidence of a gen- 
eral factor in fine motor skills, but a great deal of evidence of group 
factors or specific factors. Low intercorrelations are the rule for all 
ages above two years. It also appears that although motor tests are 
subject to large practice effects, the practiced subjects show the same 
independence of motor test results as the unpracticed. 

They further point out that group factors are dependent upon 
similar musculature, similar extent of movement, similar sensory com- 
ponents, and similar time and space patterns. The least variable 
pattern is usually more important in determining correlations than 
the others.

This specificity of factors in fine motor skills makes it unlikely
that specific job performance will be well predicted even from a
highly reliable motor test. Thus it was found that Seashore's battery
of six motor tests had no predictive value for typing performance or
for a typical factory machine operation. Among the best predictions
are those of rifle marksmanship from steadiness tests, and of fighter
pilots' work from complex coordination tests, but in neither case
was the motor test an adequate basis for selection.

Guilford and Lacey conclude that while there are numerous bat-
teries labeled mechanical ability or manual dexterity, there is little
evidence that such broad unitary abilities exist or that they predict
rate or final level of learning of complex practical skills. They point
out that careful training in attitude as well as in skill will often
yield curves of development which radically change one's position
in a group.

These results indicate clearly that there is no single test of me-
chanical aptitude or ability, but that mechanical work of various
kinds requires various combinations of about seven fairly inde-
pendent mental abilities, and several more sensory and motor abili-
ties. The best predictions will eventually come from comparing indi-
vidual and job profiles.


APPLICATIONS 

Reviews of the many hundreds of applications of mechanical tests
to groups of apprentices and workers appear in periodicals, partic-
ularly The Psychological Bulletin and The Journal of Applied Psy-
chology. The AAF volume by Melton (1948) gives a valuable sum-
mary. Lawshe (1948), who has discussed and presented charts of more
than a hundred such studies, stresses the idea that better prediction
will come when batteries of tests of independent abilities are eval-
uated by comparisons with ratings of success in the independent kinds
of skills needed in a particular occupation. He also believed that
batteries of primary-ability or aptitude tests will probably not dis-
tinguish well between candidates for various skilled trades, but that
tests of specific knowledge will do this effectively and also correlate
fairly well with success in apprentice training. For instance, the
Purdue Test of Electrical Information is as effective as any other
single test in predicting success among electrician trainees. Similarly,
the Michigan Vocabulary Profile Test, which gives separate scores
for eight independent fields of knowledge, was shown by Swartz and
Schwab (1941) to correspond significantly to ratings of ability among
thirty-seven research engineers. Three tests from this profile, those
for physical science, biological science, and mathematics, yielded a
critical score that included all of the two groups of engineers who
were rated highest, and none of the two groups who were rated
lowest, while somewhat less than half of the two middle groups
reached this critical score. Means and quartile ranges of scores on the
Michigan Vocabulary Profile Test were issued by Greene (1949) for
twenty-four occupational groups, showing very significant differences
between groups. The size of his samples, however, makes it neces-
sary to repeat this work and to secure more valid criteria and sam-
ples from different parts of the country. Simply demonstrating that
a particular average profile is typical of many successful workers in 
an occupation does not indicate that the profile has much significance 
for individual prediction. It is necessary to show direct relationships 
between profile scores and success or failure on the job. 

Bennett and Fear (1943) reported that operators of lathes, grinders, 
milling machines, and Bullard Automatics took Bennett's Test of
Mechanical Comprehension and a hand-tool dexterity test. All of 
those scoring in the upper 20 per cent on both tests were rated as 
average or above average on the job, while only 14 per cent of those 
who were in the lowest 30 per cent according to their test scores were 
so rated. 

Crissey (1944) reported similar results when he applied the Minne-
sota Spatial Relations Test and two peg boards to a group of tool-
setters. Sixty-nine per cent of those in the highest third of the com-
posite test scores were rated as above average, while only 30 per cent
of the middle third and none of the lowest third were so rated.

Jurgensen (1943) reported that the Minnesota Rate of Manipula-
tion Test (Illus. 6), as revised by Ziegler, showed a correlation of .60
with the combined ratings of three supervisors on the speed of work 
of sixty operators of converting machines in a paper mill. 

Tiffin and Lawshe (1944) report that employees of a hosiery mill 
with the poorest finger dexterity, as measured by the Purdue Peg 
Board, cost a company $59 each in minimum make-up before they 
“made the rate,” while employees in the best dexterity group cost 
only $36.40. The authors believe that tests can indicate who should be 
trained, where training should start, and how adequate the training 
has been. 

Knowles (1945) reported that a battery containing one written 
test of ability to learn, a carefulness test of sorting metal pieces, and 
one assembling-of-bolts test was effective in selecting general airplane 
mechanics. Of the highest half of those tested by this battery, 39 per 
cent were later rated as good, 58 per cent as fair, and only 4 per cent 
as poor. 





Accident proneness was detected by Williams (1943) by a series 
of manipulation tests which included dotting of small circles on a 
revolving disk, reaction to visual or auditory stimuli by pressing the
correct button, and hand-arm steadiness and strength. Those with 
scores in the lowest quarter had approximately twice the accident 
rate of the average of those in the other three quarters. 

Although these studies yield practical results, most of them ex-
hibit two rather serious shortcomings: they are based on rather small
samples and use only one or two tests, so that comparisons to show
which are the best tests are seldom possible. A large amount of re-
search is now going on, which will eventually yield unique and more
valuable measures.


STUDY GUIDE QUESTIONS 

1. What sorts of independent mechanical skills, or factors, can be
measured by paper-and-pencil tests?

2. How do tests of mechanical principles differ from those of mechanical
knowledge?

3. What evidence is there that pictorial tests are superior to verbal
tests in measuring mechanical principles?

4. What are the principal skills needed in a pure test of visualization?

5. Distinguish between tests of mechanical knowledge and tests of me-
chanical principles.

6. What is the principal characteristic of visualization tests?

7. What are the main types of motor-coordination tests?

8. How is reaction time related to the sensory organs and to age?

9. What is included in developmental tests of agility?

10. How do ballistic movements differ from precision movements?

11. What tests are available to measure motor rhythm?

12. What factors have usually appeared from analyses of batteries of
mechanical and motor tests?

13. What are the usual components of batteries of mechanical-ability tests?

14. What are the principal visual functions and how are they measured?

15. What are the usual methods of measuring agility and strength?

16. What are the main varieties of dexterity tests?

17. How are motor rhythms best measured?

18. What are the usual components of batteries of motor tests?

19. To what extent are mental and motor tests usually correlated?

20. Are the various measures of steadiness highly related?

21. What practice effects are commonly found in motor tests?

22. What factors appear most frequently in tests of mechanical abilities?

23. How well do batteries of tests predict success in mechanical and en-
gineering work?



CHAPTER X 


TESTS OF SPECIAL 
APTITUDES 




In the four preceding chapters measures of general aptitude and of 
primary abilities or aptitudes, measures of achievement, and measures 
of mechanical and motor skills were presented. Clerical and profes- 
sional aptitudes were included in Chapter VII because they seem 
closely related to academic achievements. Measures specifically de- 
signed to appraise aptitudes for particular kinds of work or avoca- 
tions remain to be illustrated. Measures related to vision, hearing, 
music, and graphic arts will be discussed here. 

CHARACTERISTICS OF TESTS 

Measures of special aptitudes are generally designed to give evi- 
dence in a narrow area of skills that are thought to be basic to a 
particular type of work. Most of them, therefore, seek to exclude 
factors of learning, intelligence, reasoning, and perceptual speed, 
which are fairly general and apply to many situations. Measures of 
aptitude are often confused with evaluations, or preference, or ap- 
preciation. While appreciation seems to be somewhat related to 
ability to recognize or to produce artistic or other works, certainly 
many are able to define and recognize different forms of art and 
music who cannot produce them and have no particular preference 
for one form. There are at least four little-related aptitudes here.

a. To define or recognize differences, perceptual and conceptual.

b. To construct original work.

c. To perform as in music, dance, drama.

d. To appreciate or prefer.

The last of these, appreciation, is a dynamic factor. (See Chapters XX
and XXI.) The first three are dealt with here. The measurements of
two sensory functions, vision and hearing, are reported, then ap- 
praisals of ability to recognize differences, to compose original work 
and to perform in music and graphic art. 

VISION 

The most widely used sense modality in everyday living is vision, 
and consequently any impairment of it is likely to cause serious 
need for remedy or adjustment While various surveys show only 
about five-hundredths of one per cent blind persons in the United 
States, approximately %o ^^.ve part sight (2%^ to 2%oo)» 

and about 30 per cent whose vision is seriously reduced, so that 
they need glasses or some other coriection. 

Although blind persons can succeed in many fields of study and 
in many occupations, a large number of industrial and business 
jobs require constant use and much skill of some visual function. 
These functions include the following. 

a. Acuity. Acuity is the ability to see a pattern distinctly at a 
certain distance, with each eye separately and with both eyes 
together. This depends upon the shape of the eyeball, lens, and cornea, 
and obstructions in the path of light after it enters the eye, as well 
as the nerve connections to the brain. 

b. Phoria. The muscular balance of the external muscles of each 
eye is called phoria. Imbalance may turn one eye vertically or laterally 
more than the other. Many people have some imbalance which may 
cause eye strain in close work. In extreme cases, known as cross eye, 
only one eye is used at a time. 

c. Binocular depth perception. Careful laboratory tests show 
that binocular depth perception is usually nearly eight times as good 
with two good eyes as with one eye. It is accomplished in the brain by 
the fusion or blending of different images from two good eyes. 

d. Color vision. The ability to distinguish fine differences in 
shade and hue is called color vision. It is dependent upon 
photosensitive cells in the retina of the eye. The best color-vision tests 
involve matching samples for both brightness and saturation, as 
well as reporting afterimages of colors. 

e. Eye dominance. Many persons have a strong preference for 
sighting with one eye rather than with the other. The preference 
is due to the relative acuity, phoria, lateral brain dominance, and 
possibly to other factors. 

The medical profession has developed simple charts as well as 
elaborate devices for measuring acuity, phoria, and other visual 
functions. There are a number of tools for nonclinical use by a 
layman. 

ILLUS. 104. THE ORTHORATER 
(By permission of Bausch and Lomb Optical Co., Rochester, N.Y.) 

One of the best single instruments now available for school, 
industrial, or business use is called the Orthorater. It yields scores 
on all five functions in from 3 to 6 minutes of minimum testing time. 
The Orthorater, shown in Illus. 104, consists of a stereoscope, lenses 
for near and far adjustments, and a set of stimulus cards with standard 
illumination. It is composed of twelve separate tests: seven for far 
vision, the optical equivalent of 26 feet, and five for near vision, 13 
inches (Illus. 105). Each of these tests has been proved to be related 
to successful performance in various types of occupations. Individual 
profile sheets are quickly compared with a job profile sheet which 
shows shaded areas for the unacceptable scores. Thus in Illus. 105 
the individual's scores meet the visual requirements of the job in all 
respects except for acuity of the right eye and for vertical phoria. 
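
The comparison of an individual profile with a job profile can be sketched in a few lines of Python. This is only an illustration: the test names, the score scale, and the acceptable ranges below are hypothetical and are not taken from the Orthorater manual.

```python
def failed_requirements(scores, acceptable):
    """Return the tests whose score falls outside the acceptable range."""
    failures = []
    for test, (low, high) in acceptable.items():
        if not low <= scores[test] <= high:
            failures.append(test)
    return failures


# Hypothetical job profile: the unshaded (acceptable) range for each test.
job_profile = {
    "acuity_both_far": (8, 15),
    "acuity_right_far": (8, 15),
    "vertical_phoria_far": (3, 7),
    "color": (5, 10),
}

# Hypothetical individual profile on the same tests.
person = {
    "acuity_both_far": 10,
    "acuity_right_far": 6,
    "vertical_phoria_far": 8,
    "color": 9,
}

print(failed_requirements(person, job_profile))
```

With these invented figures the individual fails only the right-eye acuity and vertical phoria requirements, which mirrors the kind of reading made from the score card.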

Tiffin (1947) and his colleagues have reported many careful 
applications of tests of vision in school and industry and show good results 
from more careful selection of workers for situations where good 
vision is required. The Orthorater can be operated by a careful 
layman with little training. 

ILLUS. 105. ORTHORATER SCORE CARD: VISUAL PERFORMANCE PROFILE 
(By permission of Bausch and Lomb Optical Co., Rochester, N.Y.) 

Davis (1946) reported correlations between clinical tests of acuity 
and phoria and the corresponding Orthorater scores for 32 women 
and 63 men, ranging from sixteen to sixty-five years. The correla- 
tions (p. 598) were as follows: 



                       Far    Near 
acuity, both eyes      .82    .71 
acuity, right eye      .76    .64 
acuity, left eye       .82    .70 
lateral phoria         .53    .64 


These figures show correlations between the two procedures which 
are about as high as the reliabilities for each procedure will allow. 
In general the acuity scores show close relationships. The lateral 
phoria correlations are too low to permit predictions, a result which 
in this instance was due in part to the rough measurement of phorias 
by the clinical method. 
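
The statement that a correlation can be only as high as the reliabilities allow follows from the classical result that the observed correlation between two fallible measures cannot exceed the square root of the product of their reliabilities. A minimal sketch follows; the two reliability values are hypothetical, since none are reported here for the clinical tests or the Orthorater.

```python
import math


def correlation_ceiling(reliability_a, reliability_b):
    """Upper bound on the observed correlation between two fallible measures."""
    return math.sqrt(reliability_a * reliability_b)


# Hypothetical reliabilities, chosen only for illustration.
clinical_reliability = 0.85
orthorater_reliability = 0.80

ceiling = correlation_ceiling(clinical_reliability, orthorater_reliability)
print(round(ceiling, 2))
```

Under these assumed values the ceiling is about .82, so an observed coefficient of .82 for far acuity would leave little room for improvement.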

Another somewhat similar instrument, the Betts Telebinocular, 
is a stereoscope mounted on a stand with a movable rack for holding 
cards. It requires more technical competence than the Orthorater, 
but this can be acquired by a layman. It yields reliable scores for near 
and far vision, muscular balance, binocular fusion, and eye 
dominance. 

A still simpler apparatus is the Eames Eye Tester, which 
provides a series of cards and a hand stereoscope which can be used 
by a teacher, after a little training, to detect eye defects that need 
professional care. It yields fairly reliable results for near and far 
vision, astigmatism, binocular fusion, and eye dominance. 

Good color vision is needed by artists and by certain technical 
workers, for example, cable splicers and chemical technicians. In the 
most common type of color blindness, which is present in about 4 
per cent of males and less than 1 per cent of females, there is a 
reduction in ability to distinguish red, purple, and green from a gray 
of similar brightness. Only a few people are unable to distinguish 
blues from yellows. Exact measurement of color blindness is difficult 
because the brightness and texture of the colored objects often give 
clues which allow persons to pass the test even though they may not 
have good color vision. 

Ishihara (1939) published a booklet of colored spots printed on a 
good grade of cardboard, in which numbers and figures are deline- 
ated by dots of different color but of the same or nearly the same 
brightness as the background. While this test is good for rough, 
quick screening, it is not very satisfactory for diagnostic work. Some 
color-blind people can pass it. Another widely used test, the Holmgren 
Yarn Test, employs colored skeins of wool, which the subject is 
asked to sort by colors. The Farnsworth Dichotomous Test for 
Color Blindness (1947) and the Farnsworth-Munsell 100-Hue Test 
for Anomalous Color Vision (1947) are made of colored plastic and 
printed on mat paper. All of these tests have colors that are likely 
to fade in time. 

The best tests of color vision are made with pure spectral lights 
thrown on a white surface which has no observable texture. Such 
tests, however, are only made in specially constructed laboratories. 

HEARING 

Next to vision, hearing is probably the most important means 
of contact with the environment. The two main dimensions of hearing 
are intensity and tone or pitch. Many people are deaf to certain 
tones, particularly high tones, and to certain intensities, usually the 
extreme intensities. Most deafness is caused by infections or injuries 
to the middle ear, which reduce the sound vibrations reaching the 
inner ear. The best instruments for measuring loss of hearing are 
called audiometers, and several good models are now available from 
the Western Electric Company of New York and the Maico 
Company of Minneapolis. When using the most common type, a group 
of listeners put on headphones, then write the numbers which they 
hear spoken. A phonograph record furnishes the stimuli to the 
headphones. One Western Electric Company model requires the 
individual being tested, when he no longer hears a tone in his 
headphones, to push a button and thus make a small electric light shine. 
The examiner regulates the intensity and frequency of vibrations by 
dials and records the results. This model has the advantage of being 
adapted for testing small children and also of including the 
extremely high frequencies (Illus. 106). 


ILLUS. 106. AUDIOMETER 



(By permission of the Western Electric Co.) 


The results may then be charted showing the range of tones which 
can be heard with various degrees of loudness. Audible tones range 
through about ten octaves on the piano, from 30 double vibrations 
a second to 20,000 double vibrations. Most conversation does 
not range below 100 d.v. or above 7,000 d.v. Intensities in decibels 
are generally measured downward from an arbitrary zero placed at 
normal hearing. Inability to hear intensities of about 60 decibels 
below normal is considered deafness. A loss of from 20 to 60 decibels 
would be considered hard of hearing. 
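
The arithmetic behind these figures can be checked with a short sketch. An octave is a doubling of frequency, so the number of octaves between two frequencies is the base-2 logarithm of their ratio. The classification function applies the 20- and 60-decibel limits given above; treating a loss of exactly 60 decibels as deafness, and calling losses under 20 decibels "within normal range", are assumptions made only for this illustration.

```python
import math


def octave_span(low_dv, high_dv):
    """Number of octaves between two frequencies (double vibrations per second)."""
    return math.log2(high_dv / low_dv)


def classify_loss(loss_db):
    """Classify a hearing loss in decibels below normal hearing."""
    if loss_db >= 60:  # boundary handling is an assumption
        return "deaf"
    if loss_db >= 20:
        return "hard of hearing"
    return "within normal range"  # label assumed; not named in the text


print(round(octave_span(30, 20000), 1))   # audible range
print(round(octave_span(100, 7000), 1))   # range of most conversation
print(classify_loss(35))
```

The audible range of 30 to 20,000 d.v. works out to a little over nine octaves, in rough agreement with the "about ten" quoted above, while conversation covers about six.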

MUSICAL APTITUDE 

There are three kinds of musical-aptitude tests: discrimination 
and information, rendition, and composition. The well-standardized 
tests in the field are for discrimination and information, but some 
tests of rendition will also be described. 

Discrimination 

Probably the most widely used tests of musical discrimination are 
those of Seashore (1919). By means of standard phonograph records 
one may secure scores of accuracy of pitch, loudness, rhythm, 
consonance, and tonal-sequence discrimination. The 1939 revision of the 
Seashore tests has three levels of difficulty or series. Series A is for 
unselected groups, Series B for musical groups, and Series C is a 
still more refined instrument for individual testing. Three 12-inch 
phonograph records are provided with a complete test on each side 
of each record. Separate scores are secured by standard testing 
procedures for: 

1. Pitch: discrimination of small differences between the frequencies 
of vibration of two tones with intensities and duration held constant. 

2. Intensity: ability to judge which of two tones is the louder, with pitch 
and duration constant. 

3. Time: ability to indicate which of two short periods is the longer. 
The limits of the periods are marked by a pure tone of short duration. 

4. Timbre: discrimination between two complex tones which have the 
same total energy, but which differ in the application of the energy. 

5. Tonal Memory: ability to indicate which note in a short melody has 
been changed in its repetition. One note only is changed by a whole 
tone. 

6. Rhythm: the ability to discriminate short rhythmic patterns. 

The scores of fifth-grade, eighth-grade, and college students (nearly 
one thousand of each) were tabulated by Seashore, and then 
converted into centile ranks. The median scores showed marked 
differences although the groups overlapped almost 100 per cent. For 
instance, in Pitch the median fifth-grade pupil had about 67 per cent 
of items correct; the eighth-grade pupil, 78 per cent; and the college 
student, 81 per cent. The eighth-grade scores were thought to be 
typical of average adults. The reasons for differences among these 
groups are not clear. It may be that the younger groups make poorer 
scores because of inability to understand the directions or 
unwillingness to pay attention. On the other hand, it may be that acuity of 
discrimination improves by maturation during adolescence. 

The effect of repetition of the Seashore Measures of Musical 
Talent was reported by Farnsworth (1931) to be small under the 
standard test conditions. C. E. Seashore (1919) and Brennan (1926) also 
reported that special musical training has little effect upon test scores. 
Whipple (1903) and R. H. Seashore (1926), however, reported marked 
improvements through training in pitch discrimination and rhythm 
respectively. More research is needed to show the limits of individual 
improvement, particularly among those who at present make low 
scores because of poor interest and lack of opportunity to hear music 
which has been accurately reproduced. 

A number of similar tests have been prepared. Kwalwasser and 
Dykema (1930) published ten phonographic tests, the first seven of 
which are much like the Seashore tests. The last three involve 
judgments of pleasantness and association of images with tonal and 
rhythmic sequences. Ortmann (1929) devised tests of discrimination 
of pitch, intensity, and fusion, and of memory for pitch, rhythm, 
melody, and harmony. Tests of absolute pitch have been described, 
but not yet standardized. Bachem (1937) tested 103 persons for 
absolute pitch by asking them to identify notes struck on a piano. Seven 
of them were correct for every piano note, and for other musical 
instruments as well. Forty-four made only small errors, or errors of 
exactly one octave. He proposed a theory which makes absolute pitch 
dependent in part upon “tone chroma” within the octave. 

Information 

Several information tests have been standardized. Those of 
Kwalwasser are typical of the best. The Kwalwasser-Ruch (1924) Test of 
Musical Accomplishment was designed to measure ability to read 
musical notation from the fourth to the twelfth grade. Ten separate 
tests were provided to indicate: 

1. Recognition of names of musical symbols 

2. Recognition of names of notes in a scale 

3. Detection of pitch errors in a familiar melody 

4. Detection of time errors in a familiar melody 

5. Recognition of pitch names 

6. Knowledge of key signatures 

7. Knowledge of time signatures 

8. Knowledge of note values 

9. Knowledge of rest values 

10. Recognition of familiar melodies from notation 

The items were arranged within each test according to order of 
difficulty, and the mean scores for four thousand children were 
reported. Most of the tests showed steady progress throughout the 
grades, but Kwalwasser found some evidence that the last two tests 
showed little improvement beyond the seventh grade. Girls did 
slightly better than boys, but the average child was found to be 
seriously lacking in most of the simplest fundamentals. 

Another test by Kwalwasser (1927) measures verbal information 
about composers, compositions, and instruments. About two 
hundred true-false and completion items, such as those shown in Illus. 
107, were designed for adults who had taken courses in musical 
appreciation. 

ILLUS. 107. MUSICAL VOCABULARY TEST 
Test 9 

Directions: The statements found below are either true or false. Read each 
statement carefully and then underline true, if the statement is correct, or 
false, if it is incorrect. The sample is marked as it should be: 

Sample: Allemande is a German dance                                    True False 

1. The fantasia is a composition of strict form                        True False 
2. The polonaise is a Polish dance                                     True False 
3. The oratorio is a scriptural or epic story set to music             True False 
4. The coda is found in the middle section of musical compositions     True False 
5. Beethoven substituted the rondo for the scherzo form in the sonata  True False 
6. An etude is a study or lesson                                       True False 
7. A waltz is written in 2-4 time                                      True False 
8. A minuet is written in 3-4 time                                     True False 
9. A scherzo is written in 2-4 time                                    True False 
10. A note represents a time value as well as pitch                    True False 
11. A phrase is longer than a period                                   True False 
12. Polyphonic means literally “many voiced”                           True False 

(Kwalwasser, 1927, p. 97. By permission of the Bureau of Educational 
Research, University of Iowa) 


Rendition 

Ability to play or sing artistically has been the subject of a 
number of analytical studies. Seventy-four reports of appraisals of the 
sound of spoken prose or poetry were reviewed by Metcalf (1938). 
Early in the century Scripture (1902) used a vibrating membrane to 
transfer the vibration of the voice to a kymograph record. Shepard 
(1913) enlarged this apparatus to include separate records for nose 
and mouth of vibration and the passage of air. Seashore (1927) 
described an instrument which recorded photographically the pitch 
and intensity of vocalization over long periods of time. These 
instruments have been used both for the analysis of the time and intensity 
pattern of a passage and for the comparison of individual renditions. 
No simple methods of appraising either a passage or its rendition 
have appeared as yet. Birkhoff (1933) proposed an elaborate formula 
for comparing the aesthetic value of passages, in which Orderliness 
is divided by Complexity. Orderliness referred to alliteration, 
assonance, rhyme, and musical vowels, and Complexity was the total 
number of speech sounds plus the number of word junctures which did 
not admit of liaison. Beebe-Center and Pratt (1937) reported a 
correlation of .75 between Birkhoff’s aesthetic measure and students’ 
preferences for nonsense passages. When meaningful passages were 
used, the correlation dropped to .08. 
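
The general form of the measure, Orderliness divided by Complexity, can be sketched as follows. The sketch is only a schematic illustration: the simple unweighted sum of the four order elements is an assumption and does not reproduce Birkhoff's own weighting, and all of the counts are hypothetical.

```python
def aesthetic_measure(orderliness, complexity):
    """Ratio of Orderliness to Complexity for one passage."""
    if complexity <= 0:
        raise ValueError("complexity must be positive")
    return orderliness / complexity


# Hypothetical counts for one short line of verse.
order_elements = {
    "alliteration": 3,
    "assonance": 2,
    "rhyme": 2,
    "musical_vowels": 4,
}
speech_sounds = 24          # total number of speech sounds
junctures_no_liaison = 2    # word junctures not admitting liaison

orderliness = sum(order_elements.values())        # unweighted sum (assumption)
complexity = speech_sounds + junctures_no_liaison

print(round(aesthetic_measure(orderliness, complexity), 2))
```

A passage with more order elements for the same number of sounds receives a higher ratio, which is the property the formula is meant to capture.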

Studies of the artistic rendition of a passage are illustrated by 
Schramm (1935). He reported that short rhythmical intervals in 
poetry may deviate from equality by as much as 24 per cent without 
impairing rhythmical effectiveness, and listed a large number of 
factors which contribute to rhythms, among them repetitive 
character of melodies, pitch of related or rhymed syllables, and precision 
and slurring of enunciation, of syllables, and of phrases. The best 
artists were considered to be those who knew the exact metrical 
rhythms but who deviated from them in a characteristic manner 
which usually defied any simple formulation. 

Standard tests of sight-singing ability have been devised by Gaw 
(1928), Salisbury and Smith (1929), and others. The scores are a 
summation of errors in pitch and time, as well as omissions, 
hesitations, repetitions, and extra notes. Experienced judges are needed 
for careful examining, but the method yields fairly reliable scores. 

Rendition of piano and violin music has been extensively studied 
and reported in a series of monographs from the State University 
of Iowa and also in Seashore’s (1938) Psychology of Music. 
Photographic records of pitch, intensity, and timing have been made with 
great accuracy, and have led to minute comparisons of the 
performances of great artists as well as others. 

Correlational Analyses 

The reliabilities of tests of musical discrimination and preference 
have been extensively reported by Farnsworth (1931), who 
summarized eighty-eight published studies. He concluded that Seashore’s 
tests of pitch and tonal memory appeared to have sufficient reliability 
for diagnostic purposes. The Kwalwasser tests and the remainder of 
the 1919 Seashore tests “should be employed with extreme caution.” 
The following retest reliabilities were reported for the Seashore tests: 
pitch, .75; intensity, .66; time, .51; consonance, .65; rhythm, .47; and 
memory, .83. The Kwalwasser tests of melody and harmonic 
preferences had reliabilities of approximately .42 and .21 respectively. The 
entire Seashore battery had a retest reliability of approximately .88 
for large single-age groups. 

The intercorrelation of the various tests showed similar results for 
groups of elementary school and high school pupils and adults in 
several studies. The median intercorrelations of the Seashore tests 
were approximately .48 for college groups and .25 for lower grades. 
Tests of Tonal Memory and Pitch showed somewhat higher 
correlations, as did tests of Rhythm and Time, and Rhythm and Tonal 
Memory. Nearly zero correlations were reported between tests of 
Intensity and Consonance. These correlations are, of course, affected 
by the different reliabilities of the separate tests, and they would be 
higher if the tests were more reliable. 
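
How much higher such correlations might be can be estimated with the classical correction for attenuation, which divides the observed correlation by the square root of the product of the two reliabilities. The sketch below uses the retest reliabilities quoted above for pitch (.75) and tonal memory (.83); the observed correlation of .48 is borrowed from the college-group median purely for illustration and is not a reported coefficient for this particular pair.

```python
import math


def correct_for_attenuation(r_observed, reliability_x, reliability_y):
    """Estimate the correlation between two measures if both were perfectly reliable."""
    return r_observed / math.sqrt(reliability_x * reliability_y)


r_pitch_memory = 0.48        # illustrative value only
reliability_pitch = 0.75     # retest reliability quoted in the text
reliability_memory = 0.83    # retest reliability quoted in the text

corrected = correct_for_attenuation(r_pitch_memory, reliability_pitch, reliability_memory)
print(round(corrected, 2))
```

The corrected value, about .61 here, is an estimate of the relationship between the underlying abilities rather than a figure that could be observed with the tests as they stand.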

The Kwalwasser tests of preferences for Melody and Harmony 
correlated with each other .40 in an eighth-grade group and .29 in 
a fifth-grade group. The Seashore Tonal Memory Test correlated with 
the Kwalwasser tests a little better than the other Seashore tests, but 
all the intercorrelations were rather low; the median was .16 for the 
eighth grade and .17 for the fifth grade. The fact that adult and 
college groups usually show higher reliabilities and intercorrelations 
than younger groups is thought to be best explained by their better 
ability to pay attention and to follow directions. 

The Ortmann Musical Discrimination Tests are reported by 
Petran (1937) to have reliabilities similar to the Seashore tests. 
Among a sample of 500 students, the total scores had a retest 
reliability of .80. The highest retest reliability of a single test was .86 for 
Pitch Memory and the lowest was .30 for Melody Memory. The 
intercorrelations of the separate tests ranged from .11 to .47. 

The correlations between tests of intelligence and of musical 
discrimination have usually been found to be low and positive. 
Farnsworth (1931), summarizing sixteen such studies, reported coefficients 
between the separate Seashore tests and group mental tests ranging 
from .45 to -.08 among high school and college groups. The median, 
which seems to be approximately .10, would doubtless be higher if 
feeble-minded and brilliant students were included. 
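
The expectation that the median would rise in a group of wider ability reflects restriction of range. One standard way to estimate the effect is the correction for direct restriction on one variable, sketched below. The ratio of standard deviations is hypothetical; the text gives no figures for the spread of intelligence in the wider population.

```python
import math


def correct_for_range_restriction(r_restricted, sd_ratio):
    """Estimate the correlation in an unrestricted group.

    sd_ratio is the unrestricted standard deviation divided by the
    restricted one, for the variable on which selection occurred.
    """
    r = r_restricted
    k = sd_ratio
    return (r * k) / math.sqrt(1 - r**2 + (r**2) * (k**2))


median_r = 0.10   # approximate median quoted in the text
sd_ratio = 1.5    # hypothetical: wider group has 1.5 times the spread

print(round(correct_for_range_restriction(median_r, sd_ratio), 2))
```

With a coefficient as small as .10 the estimated increase is modest, to about .15 under the assumed ratio, so the relationship would remain low even in an unselected group.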

Applications 

In Seashore's pioneer work (1919) an elaborate series of thirty tests 
and ratings were combined into profiles such as that shown in Illus. 
108. The first five tests were measures of fine discrimination; the next 
three involved self-estimates of clearness of memories of sensations. 
Tests 9 to 19 inclusive were measures of motor reactions. Items 20 
to 24 appraised vocal rendition. Items 25 to 28 were measures of 
simple and complex learning of note associations, and Test 29, a 
standard IQ from a Binet test. Item 30 was a rating of emotional 
reaction to music. The chart is stated in deciles of an age group, and 
the profile was made from records of a ten-year-old girl with 
remarkable musical talent. Since 1919 nearly all the tests have been made 
more accurate and some attention has been given to improving 
ratings. In addition, tests have been devised for preferences among 
musical compositions. 

* The items marked with an asterisk represent 
estimates in the absence of norms. 

(Seashore, 1919, p. 19. By permission of Silver, Burdett and Co.) 

Elaborate profiles have not been used widely because of the 
difficulty of constructing them, but nearly everyone who has used them 
cautiously has reported that they yield information which is valuable 
to both instructor and pupil. Most of the reports of musical 
measurements have dealt with the correlation of a few discrimination tests 
with grades or ratings in various sorts of musical appreciation or 
skills. A typical report is that of Petran (1937), who reported that 
total scores on the Ortmann tests of tonal discrimination and 
memory correlated .57 with average grades in the Peabody Conservatory 
of Music. The test included measures of discrimination of pitch, 
rhythm, and fusion, and measures of memory for pitch, rhythm, 
melody, and harmony. The highest correlation of a single test with 
conservatory grades was .45 for Chord Memory, and the lowest was 
.25 for Rhythm Memory. Larson (1938) reported enthusiastically 
on a testing program in the Rochester, New York, public schools and 
the Eastman School of Music. The Seashore test scores correlated 
.59 ± .04 with grades in the first course of musical theory in the 
School of Music. 

There have also appeared some critics of the uses and alleged 
claims made for the Seashore tests. Moos (1930) believed that they 
are not measures of musical talent because they stress sensory 
patterns rather than complex compositions and because they have been 
found to give chiefly negative results. Pratt (1931) felt strongly that 
they utterly failed to get at the kernel of musical talent, although he 
admitted that they were good tests of sensory discrimination. He 
pointed out that the usually low correlations between discrimination 
tests and ratings of musical ability indicated a nearly chance 
relation between test scores and achievement. Mursell (1937) quoted a 
number of studies which showed low correlations between Seashore 
tests and appraisal of musical ability, and concluded that the tests 
had been shown to be invalid. 

The writer feels that although some exaggerated statements have 
undoubtedly been made about the usefulness of the tests of musical 
discrimination, still they are of value in appraising skills which may 
be of importance in predicting some aspects of musical success. It 
seems probable that particular tests of musical discrimination will 
predict success in particular types of musical skill better than 
in musical ability of all types. Analytical appraisals of musical 
memory and complex skills are needed as well as careful studies of 
their growth. 

Although musical discrimination and verbal intelligence tests 
show small correlations, both have been found to correlate 
significantly with success in schools of music. Highsmith (1929) found that 
grades in a college of music correlated .423 with scores in the Terman 
Group Intelligence Test, but only .312 with the Seashore tests. 
Stanton (1929) found it advisable to add a college intelligence test 
to the Seashore Tests of Musical Talent for purposes of selecting 
students for the Eastman School of Music. 

Two reports which deal with scores of various racial and age 
groups are included here. The results were obtained from such small 
groups that they are in need of supplementation. Scores of 
Southern Negroes on the Seashore Tests of Musical Talent were 
discussed by Bean (1936). He reported that the Negroes in high school 
and college were found to be equal to the whites only in tests of 
rhythm. In all other tests they were definitely inferior. Lenoire (1925), 
however, reported that two hundred Negro children in the fifth 
grade had mean scores on the Seashore tests which were definitely 
superior to those of a similar group of white children in rhythm and 
tonal memory, and not inferior in any of the other tests. 

GRAPHIC ARTS 


Types of Tests 

Preference for Pictures or Design. The inclusion of preference 
tests under the heading of aptitudes may be questioned by some on 
the ground that preferences are essentially dynamic and not very 
closely related to ability or aptitude. These tests are included, 
however, because in many tests marked perceptual and analytical skills 
are used as a basis for preferences by many persons, particularly by 
those who do well on the tests. 

In the field of appreciation of graphic art, two tests have been 
developed and widely used: the McAdory Art Test (1929) and the 
Meier-Seashore Art-Judgment Test (1930, 1940). Both of these 
require that a choice be made between samples which are presented. 

The McAdory Art Test consists of seventy-two plates each of which 
contains four small variations of one picture (Illus. 16). The original 
drawings were taken from current art and trade magazines, and the 
variations involve changes in proportions, intensity, and color, which 
are described on the record sheet. The subject records his order of 
preference for each plate. One point is given for each picture ranked 
according to a key which represents the judgments of 100 experts. 
The experts included artists, architects, art teachers and critics, art 
buyers, and lay critics. All the keys used were agreed upon by at least 
64 per cent of the experts. Separate scores can be had for the total 
test and also for six subdivisions: (1) furniture and utensils, (2) 
textiles and clothing, (3) architecture and related arts, (4) shape and 
line arrangements, (5) massing of dark and light, (6) color schemes. 

ILLUS. 109. ART JUDGMENT TEST 
San Juan Bridge. The relative proportions of the bridge and the 
city have been altered. 
(Meier-Seashore, 1930, No. 78. By permission of the Bureau of 
Educational Research and Service, University of Iowa.) 

The Meier-Seashore Test (1940) consists of 100 small uncolored 
pairs of pictures (Illus. 109) and a record sheet. One member of each 
pair of pictures is a reproduction of a recognized masterpiece chosen 
from landscapes, pottery, portraits, oriental drawings, woodcuts, 
murals, and medallions. The other member of each pair is a 
reproduction of the masterpiece altered in some respect. As these 
alterations are noted on the record sheet, the examinee's attention is called 
to them. They include changes of position of an outstanding object, 
of background, of distribution of light and shade, of horizon line, of 
perspective, of quality of line, and of the use of curves. Although 
there are no time limits for the administration of this test, it usually 
takes from 40 to 60 minutes. The score is the number of choices 
which correspond to a key. The key was made from the consensus 
of various artists, sculptors, directors of art training, and art teachers. 
The items are arranged roughly in order of difficulty of 
discrimination by experts. 
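
Scoring against a key of this kind amounts to counting agreements. A minimal sketch follows, with an invented five-item key and invented responses; the labels "L" and "R" for the left and right member of each pair are assumptions made for the illustration.

```python
def score_against_key(choices, key):
    """Count the choices that agree with the expert key."""
    if len(choices) != len(key):
        raise ValueError("choices and key must have the same length")
    return sum(1 for choice, keyed in zip(choices, key) if choice == keyed)


# Hypothetical key and one examinee's choices for five pairs of pictures.
expert_key = ["L", "R", "R", "L", "L"]
examinee = ["L", "R", "L", "L", "R"]

print(score_against_key(examinee, expert_key))
```

Because each item offers two alternatives, a score near half the number of items is what guessing alone would be expected to produce.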

From the application of the McAdory and Meier-Seashore Tests by 
various investigators to fairly large numbers of persons, it has been 
found that: 

1. Retest correlations ranged from .71 to .93, and odd-even correlations 
of .59 and .65 were found (McAdory, 1929, and Meier, 1939). 

2. The Meier-Seashore and McAdory Tests correlated with each other 
.37 in small groups of college or art students (Carroll and Eurich, 1932, and 
Wallis). 

3. Age and grade norms show improvement in the Meier-Seashore Tests 
from nearly chance mean success in the eighth grade to 87 out of 125 
items for art teachers. Norms for the McAdory Test have been secured from 
the third grade up. 

4. Women exceed men in mean scores on both tests by small but fairly 
significant amounts (Carroll and Eurich, 1932). 

5. Artists and students with artistic training have average scores which 
are significantly higher than similar college students without such training. 

6. Studies of Mexican children (Stolz and Manuel, 1931) showed them to 
have about the same scores as non-Mexican children of the same general 
environments and age, both groups having very low mean scores. Steggerda 
(1936) found 300 Navajo Indian children, ages 11 to 16, to be far below the 
McAdory norms of New York City whites, and concluded that the test does 
not reveal the artistic ability of these Indians. 

7. Correlations between scores on intelligence tests and art appreciation 
tests are usually between .07 and .26 (Tiebout and Meier, 1936). 

8. Correlations between art appreciation tests and Bernreuter’s 
introversion, submissiveness, and emotional stability tests were all nearly 
zero, among 218 students (Carroll, 1932). 

Other tests which illustrate interesting approaches will be briefly 
described. Maitland Graves (1948) published a 90-item test of 
appreciation of nonrepresentative graphic art, called the 
Design-Judgment Test. Nonrepresentative art was used in order to avoid specific 
personal reactions to specific objects. The test was devised to indicate 
the degree to which a subject perceives or at least naively responds 
to figures which have different degrees of goodness according to the 
“basic principles of aesthetic order: unity, dominance, variety, 
balance, continuity, symmetry, proportion, and rhythm.” These 
principles were explained in an earlier book by Graves (1941), The Art 
of Color and Design. 

The examinee is asked to look at each page in a 5- by 9-inch 
booklet and indicate on a separate answer sheet the design which 
is preferred. Eight of the pages have three designs each and the other 
82 pages two designs each, so one can theoretically get about 44 
correct by chance. 
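
The figure of about 44 can be verified directly: the expected number correct by guessing is the sum, over all pages, of one divided by the number of designs on the page. The sketch below assumes that each page has exactly one keyed design and that a guesser chooses among the designs with equal probability.

```python
from fractions import Fraction


def expected_chance_score(designs_per_page):
    """Expected number of correct choices when every page is answered by guessing."""
    return sum(Fraction(1, n) for n in designs_per_page)


# Eight pages with three designs each, eighty-two pages with two designs each.
pages = [3] * 8 + [2] * 82

expected = expected_chance_score(pages)
print(expected, round(float(expected), 1))
```

The exact expectation is 131/3, or about 43.7 items out of 90, which rounds to the 44 quoted above.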

The test booklet is a beautiful printing job; the backgrounds and 
figures are done in contrasting flat white, black, and gray. The 
designs show some texture, and fifteen are drawn or photographed to 
represent three dimensions. Basic themes and identical designs 
appear a number of times on several pages, always changed, however, 
by achromatic pattern or rotation and shown in combination with 
different forms (Illus. 110). 

The test was constructed by trying out 150 items on groups of 
art teachers, art students, and others in both related and unrelated 
fields. Items which showed (1) agreement among art teachers, (2) 
significant differences between art and nonart students, and (3) the 
higher correlations with total test scores (internal consistency) were 
retained. 

ILLUS. 110 GRAVES DESIGN JUDGMENT TEST, ITEMS 51 AND 74 




(By permission of Maitland Graves and The Psychological Corporation.) 

The scoring is simply the number of correct preferences. Among
fourteen groups of students reliability computed from split-half
scores ranged from .81 to .86, median .83. The validity of the test
is shown by its application to various groups of students not in- 
cluded in the original validation. Groups of college students of art, 
architecture, and illustration all showed means in the neighborhood 
of .75, while other college students averaged about .46. High school 
art students had a mean of .56 and nonart .38. These figures lead 
to the conclusion that much more than half of the high school popu- 
lations studied prefer the poorer art. 
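A split-half reliability of the kind reported above can be sketched as follows. The text does not say how the halves were formed or whether the Spearman-Brown step-up was applied, so the odd-even split, the step-up, and the small set of item scores are all assumptions made only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def spearman_brown(half_r):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * half_r / (1 + half_r)


def split_half(item_scores):
    """Correlate odd-item and even-item totals for each examinee."""
    odd = [sum(row[0::2]) for row in item_scores]
    even = [sum(row[1::2]) for row in item_scores]
    return pearson(odd, even)


# Hypothetical data: 5 examinees by 6 items, 1 = keyed preference chosen.
SCORES = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
]

if __name__ == "__main__":
    half_r = split_half(SCORES)
    print(f"half-test r = {half_r:.2f}")
    print(f"stepped-up reliability = {spearman_brown(half_r):.2f}")
```

A real computation would use the full 90 items and the actual groups; six items and five examinees are far too few to give a stable coefficient.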

The Whitford Test included fourteen preferences to be made in 
15 minutes. Appropriateness of form, line, proportion, rhythm, color, 
and perspective were to be judged. Tests of one thousand Chicago 
school children showed that the average fourth grade pupils in “superior”
schools exceeded the scores of average eighth grade pupils in
“medium” schools.

The University of Wisconsin Test features three aspects of art: 
unity, proportion, and fitness. Nine plates were prepared, each with
five small pictures: a “perfect” picture and four variations. Tests 
of high school and elementary grades showed progress from prefer- 
ences of incongruous pictures in the early grades toward preferences 
for “good art” at the later ages. 

Christensen’s Test included 105 plates with four colorless pictures 
to a plate. Separate scores were given for five parts: (1) paintings of 
groups of persons, (2) pictures of an individual, (3) architecture and 
sculpture, (4) industrial arts, and (5) designs intended to illustrate 
abstract art. 

Bird (1932) believed that these tests indicate conformity to conventional
or commonly experienced products rather than true appreciation
of an aesthetic sort. Pintner (1918) and Berliner (1918)
found that conformity to conventional pictorial representation was
well developed at the age of seven and was almost completely developed
at ten, since the ranking of pictures by ten-year-olds was similar
to that of average adults.

Voss (1936) found that children in the second, third, fourth, and 
fifth grades could be taught abstract principles of art composition 
by having differences between a poor and a good composition pointed
out to them. As aesthetic judgment improved, pictures were comprehended
less often as substitutes for objects and more often as
representations of ideals or moods.

Korperth-Tippel (1935), working in Vienna, reported that below
the age of three years the predominant attitude toward pictures is
nonaesthetic. Between the fourth and tenth years aesthetic preferences
develop gradually, and the eleventh year shows a marked
transition. Nearly all the fourteen-year-olds select the more artistic
pictures of the pairs presented.

There is no way of indicating from most of these tests what the
nature of aesthetic experience is. All one can say is that under the
test conditions certain preferences were indicated. Whether or not
a particular person responds to one or several aspects of a picture
cannot be known unless we go further and try to find the reasons for
his preferences.

Responses to Lines 

Closely allied to the evaluations of a whole design is the aesthetic
evaluation of small fragments of a design. Three studies, using both
matching and reproduction techniques, illustrate this approach.

Poffenberger and Barrows (1924) asked five hundred adults to look
at eighteen printed lines, and then to choose, from a list of words,
one which described the way each line made them feel (Illus. 111).
A marked tendency was found for a slow descending curve to be
indicative of sad, lazy, and weak; slow horizontal curves, gentle and
quiet; medium rising curves, merry and playful; rapid rising angles,
agitated, furious, and powerful. A conclusion was reached that “direction
of line was generally most important, rhythm next, and form
least in this particular study.”

ILLUS. 111. FEELING VALUE OF LINES

LIST OF WORDS

Sad, melancholy, mournful, doleful, sorrowful
Quiet, calm, tranquil, serene
Lazy, indolent, idle
Merry, cheerful, gay, jolly, joyous
Agitated, excited, fiery, brisk, vivacious, lively
Furious, angry, cross, enraged
Dead, dull
Playful
Weak, feeble, faint, delicate
Gentle, mild
Harsh, hard, cruel
Serious, solemn, grave, earnest
Powerful, forceful, strong

Note: One word is to be chosen from the list to indicate how each line makes you feel.

(Poffenberger and Barrows, 1924. By permission of the Journal of Applied Psychology.)

In Guilford’s (1931) study, twenty-four adjectives were selected
and then printed with a blank space. The instructions were: 

In the space below you will find a list of adjectives. Take each in turn. 
You are to think of its meaning and then draw a single line which best 
expresses the meaning that it conveys to you. Your lines will be graded in 
their general form, direction, and heaviness. You will have ten minutes in 
which to complete this test.

The adjectives were: sad, forceful, dead, earnest, playful, tranquil,
lively, cruel, joyous, quiet, grave, gentle, lazy, fiery, furious, jolly,
hard, agitated, angry, faint, delicate, idle, sorrowful, and strong. The
results were scored to show the conformity of an individual to the
tendencies of a group of 144 students, 55 of whom were studying
design. For instance, the word idle was represented as follows:

      Form               Direction              Line

angular       9      horizontal    110      heavy       4
curve        57      upward          9      medium     56
wave         28      down           25      light      84
straight     50

This tabulation shows that most students drew a light horizontal,
curved or straight line for the meaning of the word idle. Students
who made such lines were given more credit than others. Wider
deviations were given less credit than smaller deviations.
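The exact weights used in this scoring are not given here, so the following is only a sketch of one way such conformity credit might be assigned. It assumes, hypothetically, that credit for each attribute of a line is the proportion of the group choosing the same category, averaged over form, direction, and heaviness. The tallies are those reported for the word idle; the function name and the scoring rule are illustrative.

```python
# Group tallies reported for the word "idle" (144 students).
IDLE_TALLY = {
    "form": {"angular": 9, "curve": 57, "wave": 28, "straight": 50},
    "direction": {"horizontal": 110, "upward": 9, "down": 25},
    "weight": {"heavy": 4, "medium": 56, "light": 84},
}


def conformity_credit(drawing, tally=IDLE_TALLY):
    """Average proportion of the group sharing each attribute of a line.

    This proportional rule is an assumption for illustration; the
    source states only that common choices earned more credit.
    """
    total = 0.0
    for attribute, counts in tally.items():
        group_size = sum(counts.values())
        total += counts[drawing[attribute]] / group_size
    return total / len(tally)


if __name__ == "__main__":
    typical = {"form": "curve", "direction": "horizontal", "weight": "light"}
    unusual = {"form": "angular", "direction": "upward", "weight": "heavy"}
    print(f"typical line: {conformity_credit(typical):.2f}")
    print(f"unusual line: {conformity_credit(unusual):.2f}")
```

Any rule of this kind rewards the light horizontal curve and penalizes the heavy rising angle, which is the ordering the tabulation implies.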

The scores of two groups, each containing fifty students in design,
showed split-half reliabilities of approximately [figure illegible]. The test scores
showed correlations of .81 and .83 with teachers’ estimates of originality
and fertility in design, when both the tests and estimates were
corrected for attenuation, but only .58 and [figure illegible] when uncorrected.
The correction for attenuation theoretically removes chance errors
of measurement (Chapter XIII). The actual predictive value of the
test, however, is shown by uncorrected correlations. The uncorrected
correlations between test scores and art grades were [figure illegible] and .58,
which are fairly high for this sort of test. The art test scores correlated
.18 with the United States Army Alpha for sixty-nine liberal-arts
students, and .08 with the American Council on Education College
Psychological Examination for freshmen among forty-eight women
students. Apparently the line-drawing test involves abilities not
called for in verbal mental tests.
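The correction for attenuation mentioned above divides the observed correlation by the square root of the product of the two reliabilities. The sketch below applies that standard formula. The two reliability values are hypothetical, chosen only to show how an uncorrected coefficient of .58 could rise to roughly the corrected level reported; they are not figures from the study.

```python
from math import sqrt


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation between true scores.

    r_xy: observed correlation between test and criterion
    r_xx: reliability of the test
    r_yy: reliability of the criterion
    """
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / sqrt(r_xx * r_yy)


if __name__ == "__main__":
    # Hypothetical reliabilities for the line test and the teachers' estimates.
    corrected = correct_for_attenuation(r_xy=0.58, r_xx=0.70, r_yy=0.73)
    print(f"corrected r = {corrected:.2f}")
```

Because the divisor can never exceed 1, the corrected value is always at least as large as the observed one, which is why the uncorrected coefficient is the safer guide to practical prediction.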

Walton (1936) used both the Guilford and the Poffenberger and
Barrows techniques. Tests of matching lines with words correlated
with giving words to represent lines in the neighborhood of [figure illegible].
Self-correlations between two trials of either test ranged between .26 and
.70 for small groups of children. The correlations between the test
of drawing lines to represent meanings of words and other art tests
were all approximately .31. Grades in school, intelligence-test scores,
tests of discrimination of color harmony, and sex appeared to be
unrelated to tests of matching lines and word meanings.

Responses to Color 

Aesthetic responses to single colors have been studied with the
hope of finding basic or marked tendencies of feeling or association.
The evidence seems to show that when texture and brightness are
eliminated, the response to color is largely a learned one.

Early experiments are well illustrated by Stefanescu-Goanga 
(1911), who asked six subjects to give introspective reports of the 
effect of each of a series of colors. The number of times each color was 
found to be exciting, soothing, pleasing, and displeasing was reported.
In general, red, orange, and yellow were stirring, warm, and
cheerful; green was quieting; blue, indigo, and violet were depressing,
serious, cold, and sad; purple was exciting and dignified. The
feelings were thought to be aroused because of associations of specific 
sorts. 

Bullough (1910) classified responses to colors into four groups: 

1. Objective: such as judgments of hue, saturation, and luminosity

2. Physiological responses: one’s awareness of bodily feelings, such as
excitement, weight, and warmth

3. Associations of specific sorts: moon, water, fire, and medicine

4. Character: i.e., imputing to a color a particular character or set of
personal traits

He reported that in most of his observers one class of response 
predominated and that each type of response was usually incom- 
patible with the others. The objective responses seemed typically 
intellectual, the physiological, merely adjustive, and the associative, 
variable or intellectual. The “character” responses were considered
to be the most aesthetic because they “objectified emotion.”

The findings of Bullough were not supported by the monumental
work of von Allesch (1925), which reported 20 years of observations.
He found that tints and shades of all colors, when background and
suggestion were avoided, evoked no typical responses, and that all
observers were inconsistent in some manner. He found so little consistency
in introspective responses to short exposures (7 seconds) that
he avoided general conclusions. The most frequent effects reported
were activity and passivity, cheerfulness, sadness, earnestness, loudness,
softness, friendliness, hostility, warmth, and coolness. He also
described the appearance of color as a film, a surface, or a space. He
sought to have his observers see only film colors, because the other
aspects seemed to arouse additional associations.

He also found disagreements and self-contradictions in preferences
for pairs of colors. The responses to color pairs added ideas of harmony,
based somewhat on associations with single colors. Thus, a
dull brown and a bright orange seemed to represent to one observer
severity and pride which did not belong together. Verbal suggestions
of solemnity, grief, friendliness, joy, poverty, gave color associations
which were not otherwise made. Every color and color pair could
be made, by suggestions, to seem commonplace, and nearly all colors
could be associated with both the tragic and comic. Color pairs were
as susceptible to suggestions as single colors.

Children’s responses to color have been studied by about twenty
investigators using various materials and techniques. The materials
were usually colored papers which were not, of course, pure and
which had a definite texture as well as a color. Generalizations from
such materials should not be made freely, but the following conclusions
are fairly well agreed upon:

1. Infants seventy days of age discriminate between red orange, green,
and blue green when brightness differences are eliminated (Chase, 1937).

2. Four colors are clearly differentiated by the age of fifteen months
(Staples, 1932).

3. Children between eight and twenty-four months are most effectively
stimulated by red, next yellow, blue and green in order (Holden and Bosse,
1900).

4. Groups of four- and five-year-olds stated that orange was the favorite
color, pink, second, and red, third.

Williams (1933) and Walton (1933) found that high scores in sensitivity
to color harmony, as shown by the selection of a colored scarf
to go with a colored dress on a doll, were found as early as the fourth
year of age, but that age-group averages did not exceed chance until
the eighth year. The color test scores indicated the amounts of agreement
with judgments of experts. There was a gradual increase in
scores to the twelfth year. College students did much better than
twelve-year-olds. Intelligence-test scores, teachers’ ratings of artistic
ability, and scores from art tests based on form discrimination were
not closely related to scores on this test of color-harmony sensitivity.

Discussion of Tests 

From the descriptions of tests of artistic appreciation just given, 
it appears that 

1. Total scores on group tests of 100 items or more give fairly
stable group means.

2. Total scores are usually not reliable enough for appraisal of
individuals, except when mature students of art have been tested
on discrimination of patterns.

3. Low correlations between various subtests in art-appreciation
batteries indicate the probable existence of several independent
patterns of response. This situation results in ambiguity of interpretation
of scores, for a given score can seldom be said to refer to a particular
pattern of response.

These conclusions point to the need of more analytical appraisals
which would show a person his profile of independent skills. A very 
fascinating field of research lies in this direction. From a survey of 
existing work, it seems probable that the following will eventually 
appear to be among the independent aspects of picture apprecia- 
tion. 

1. Discrimination of moods of isolated lines 

2. Discrimination of coherence of (1) lines, (2) areas, and (3) 3- 
dimensional spaces 

3. Discrimination of moods of colors 

4. Discrimination of minute variations of theme 

5. Other specific associations of personal memories, desires, prej- 
udices or fears, as shown in part by the Rorschach technique 
(Chapter XXIII). 

6. Feelings of satisfaction associated with the activities listed im- 
mediately above 

With this tentative analysis of aspects of appreciation, let us turn to 
the consideration of activities involved in artistic composition. 

THE NATURE OF ARTISTIC COMPOSITION 

The psychological processes actually used in the creation of a 
design have generally been reported only in fragmentary anecdotal 
fashion. However, three studies which attempted a systematic observa- 
tion of artists or others at work have appeared. Grippen (1933) 
found that creative artistic imagination was rarely demonstrated 
below the age of five, as shown by spontaneous drawings. By watch- 
ing artistically talented six-year-olds, he discerned the following 
seven types of imaginative development. 

1. Revision of a single memory image

2. Organization on the nature of a composite from several images, usually
related

3. Improvisation of a theme, resembling the source or sources, from a
number of images

4. Selection of various elements of aesthetic interest, to which other elements
may be added, all based upon a single memorial or sensory experience

5. Compositional expressions arising as a reaction from a single memory
touching upon some more or less strong emotional experience

6. Effective expressions appearing in appropriate compositional setting
from a single vivid aspect of a larger experience residing in the child
as a memorial experience

7. Fusion of compositional elements or aspects into a composition of
high character, from a continuing experience over a limited time interval

A comparison of sixty-seven drawings of talented and seventy-nine
drawings of nontalented children by Grippen (arranged from p.
80) showed large differences as follows:

                                             Talented        Nontalented

Proportion of drawings exhibiting
  composition skill                          90 per cent     [illegible] per cent
Number of verbal self-criticisms             77              4
Mean number of relevant verbal comments      [illegible]     72
Mean time per drawing                        15 min.         3 min.
Mean number of colors used                   3 plus          1 plus

These figures indicate much greater interest and activity, both
mental and physical, and greater perseverance on the part of the
talented children. The relative importance of motivation and ability
cannot be evaluated from this study.

Patrick (1937) recorded the verbal expressions of 50 artists and 50
unpracticed sketchers who agreed to talk about their work as they
worked. The conclusion was that both groups revealed the same four
stages, which were also found in poetic creation: preparation, incubation,
illumination, and verification.

Lark-Horovitz (1936) studied the drawings of untrained adults,
seventy-four men and ninety-six women, whose average age was
about twenty-five. The following objects were drawn from memory:
violin, chair, church, duck, horse, man, woman, child, flower, automobile,
and a country road. She followed Kerschensteiner’s (1905)
analysis of three general stages of development: schematic representation
typical of children, presentation true to appearance in two
dimensions, and perspective drawing.

These untrained adults made drawings much like children from
six to ten years of age; most of them were of the schematic type. A
questionnaire showed that most of them had tried to reconstruct the
picture from memory on some logical basis but were unable to disentangle
various schemata. Manual ability was thought to be of small
importance, but failure to grow in graphic presentation seemed due
to lack of training in visual discrimination and synthesis.

To some extent the processes in artistic composition are also indicated
by studies of development of drawing ability.


Development of Drawing Ability 

An examination of the extensive literature on children’s drawing ¹
shows a widespread belief in the existence of several stages of development.
There is fairly general agreement that the preliminary stage
is one in which scribbling marks are made, with no attempt at representation.
Occasional similarities between objects and drawings
may be noticed by the child. In the next stage important details of
a person or house may be drawn as they are remembered, with only
slight success in putting them together. Thus, Illustration 31 shows
a man whose hands and legs are attached to his head in the absence of
any body. This stage is usually one where motion or use is depicted.
In a later stage the idea of the whole tends to dominate, so that the
details have their proper relative size and position. Correct perspective
is a still later technique. In older children more attention is
given to style and fineness of detail and composition. A few specific
studies state the ages of development for certain types of discrimination
and compositions.

1 Ayer (1916), Goodenough (1928), Tomlinson (1934), Meier (1933), and Anastasi
and Foley (1936).

Daniels (1933) found that preschool children, ages two to five 
years, showed marked preferences for balanced block designs, but 
that preference for balanced designs was not correlated either with 
ability to reproduce the design with blocks or with Stanford-Binet 
scores. Whorley (1933) found that unified compositions of toy trees 
and of furniture on standard backgrounds were rarely made before 
the age of four. Such unity increased gradually to the age of ten. The 
test scores involving outdoor arrangements correlated with scores of 
indoor arrangements between .29 and .45 for various small groups. 
The investigator felt that “fitness” may have been a more important
factor in the indoor model than in the outdoor model. 

Saunders (1936) found that two years of intensive training showed 
radical changes in art ability among children in the first four grades. 
The improvement bore a direct relation to amount of instruction 
and to initial degree of artistic inferiority. Unfavorable home conditions,
use of improper materials, lack of motivation, and lack of
sensitivity to elements of artistic quality were found to be important 
deterrents to the development of drawing ability. 

These reports all agree that artistic composition in the field of design
is very much improved by familiarity with types of composition
and their fine points. Interest, amount of mental activity, persever- 
ance, and length of special training were very effective in improving 
artistic compositions. 

Tests of Composition and Representation 

The ability to compose an artistic work is not easy to distinguish 
from the ability to draw mere representations of objects. 

Since there are no sharp lines between an artistic picture and one
not so artistic, scales of values have been established on the basis of
agreement of judges by Thorndike (1916), McCarty (1924), Kline and
Carey (1922 and 1923), and Tiebout (1933). The Thorndike scale
is a series of thirty-four drawings, each of different subjects, arranged
in order of excellence by sixty artists and sixty art teachers.
McCarty’s scale consists of three series of thirty-four drawings each,
one of persons, one of horses, and one of landscapes. Each series is
divided into nine steps in a scale based on ratings of sixty judges.
The drawings were collected from children, ages four to eight years.
The two Kline-Carey scales are intended to measure two different
aspects of composition: representation and design. The first consists
of fourteen graded samples of each of the following: house, tree, rabbit,
and a figure in action drawn to illustrate a short story. The
second includes ten or eleven graded samples of each of the following:
illustration, poster, border, and structural design. Each sample
is given a numerical value based on order-of-merit selections of 152
judges who were persons with considerable art training.

Tiebout (1936) introduced color, and followed the plan of having children
illustrate short stories or parts thereof. Paintings were made independently
by each child using tempera paints under standard conditions.
One hundred children in each grade from the first to the
seventh inclusive contributed three drawings each. Four judges made
preliminary selections of thirty paintings for each grade and ranked
these in five piles according to artistic quality. Fourteen experts then
arranged these selected paintings. They were advised to give consideration
to “attainment of rhythm, balance, unity, and other
aesthetic qualities, rather than to technique or realistic representation.”
In the establishment of artistic value the opinion of a few
well-qualified judges was considered to be more valuable than that
of a larger number of less experienced judges. In its final form the
scale has from eleven to sixteen paintings for each grade. The judges’
agreement upon the relative value of the scaled pictures was high;
the correlations were .94 and .93 between average ratings of half of
the judges with the other half. Two trained workers using these
scales for classifying original drawings showed correlations of approximately
.79 and .73 with the average of two other trained workers
judging one hundred paintings in each grade. No figures were
given for consistency of individual judges.
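Agreement of the half-against-half kind reported for the Tiebout scale can be sketched as follows: the judges are divided into two panels, the ratings of each panel are averaged for every painting, and the two sets of averages are correlated. The ratings, the number of judges, and the way the panels are divided are all hypothetical and serve only to illustrate the procedure.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def panel_means(panel):
    """Average the ratings of a panel of judges for each painting."""
    return [sum(ratings) / len(ratings) for ratings in zip(*panel)]


def half_panel_agreement(ratings):
    """Correlate mean ratings of the first half of judges with the second."""
    middle = len(ratings) // 2
    return pearson(panel_means(ratings[:middle]), panel_means(ratings[middle:]))


# Hypothetical ratings: 4 judges (rows) by 5 paintings (columns).
RATINGS = [
    [1, 2, 3, 4, 5],
    [2, 2, 4, 4, 5],
    [1, 3, 3, 5, 5],
    [1, 2, 3, 4, 4],
]

if __name__ == "__main__":
    print(f"half-panel agreement r = {half_panel_agreement(RATINGS):.2f}")
```

A coefficient of this sort describes the stability of panel averages; it says nothing about how consistent any single judge would be, which is the gap noted at the end of the paragraph above.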

A novel method of appraising compositional ability, which was
designed to be free from technical skill, was described by McCloy
(1939). The materials used were a control board which varied the
intensity and color of a screen on a small stage, a number of small
clay statues and objects, and twenty-five landscapes which could be
used as backgrounds. After becoming familiar with the materials,
subjects were asked to arrange sets of objects “until you get the effect
you like best.” Colored photographs of the preferred arrangements
were appraised by three selected judges according to a uniform
scheme in which each of the following items was rated on a ten-point
scale:

1. Arrangement and use of light and shade

2. Arrangement of figures

3. Background selection

4. Originality

a. Compositional arrangement

b. Emotional interpretation

5. Color

a. Harmony

b. Appropriateness in background

After testing twenty subjects of various ages and with various 
amounts of training, McCloy reached the conclusion that creative 
ability under these conditions bore no relationship to age after twelve 
years, to amount of time used, or to previous artistic training. At one
extreme the various subjects exhibited arrangements which seemed 
entirely accidental, and at the other, arrangements which followed 
definite ideals formed after first seeing the clay forms. 

A scale of drawing was constructed by Goodenough (1928) for the 
purpose of indicating intelligence rather than artistic merit (see 
Chapter V). An attempt was made entirely to disregard artistic merit. 
Nearly four thousand drawings of a man by children of various age 
groups were inspected to find what changes took place between suc- 
cessive ages in accuracy of representation. She distinguished eight steps 
of psychological growth, which are much like those listed by Grippen 
(1933) for imaginative development: 

1. Seeing a resemblance between pictures and objects 

2. Noting the parts to be drawn 

3. Selecting the most essential parts 

4. Noting relative position 

5. Noting relative proportions 

6. Representing parts with simplified outlines 

7. Coordination of hand and eye in drawing 

8. The addition of new features as the concept develops 

Goodenough's scale consists of verbal descriptions of fifty-two 
attributes of a drawing of man, each of which is given one point. 
Forty illustrative drawings are also printed (see Illus. 29). Total 
points may be changed into a mental age. The points include the 
presence of parts, their attachment, proportion, and details. Motor 
coordination of the child is scored by the absence of unintentional
irregularities. Perspective and figures drawn in profile receive extra
credit. This method of scoring drawings undoubtedly gives credit to
some unartistic details, but qualities of correct proportion, unity,
and fitness, stressed by nearly all the analyses of art, are also given
credit. This work shows a fairly clear overlapping of factors in scales
of artistic and representative ability in drawing. It would be interesting
to have the same drawings ranked for both values, in order to
see their closest relationship. No study of this sort has come to hand.

Horn Art Aptitude Test. Horn and Smith (1945) report an Art 
Aptitude Inventory developed by the staff of the School of Applied 
Art of the Rochester Institute of Technology over a period of 8
years. The student is required to make drawings to illustrate his
quality of line or shading, compositional sense, fertility of imagina- 
tion, and use of abstract or natural forms. The test consists of the 
following three parts: 

1. Scribble Exercise. The student is asked to draw twenty different items,
for example, a book and a fork, in limited times varying from 2 to 6 seconds.
The whole exercise takes about 5 minutes. This part is designed to
give the student confidence and to show clarity of thought and presentation,
and orderly arrangement on the entire page.

2. Doodle Exercise. The student is asked to draw various lines and
shapes within prescribed areas. This part shows abstract composition ability
and originality.

3. Imagery. The student is asked to construct sketches which are suggested
by key lines which are already drawn in twelve rectangles, each
2% by 3% inches. This section shows fertility of imagination, scope of interests,
shading, and style.

The scoring is accomplished by comparing a completed test with
samples given in the manual of directions and with samples of work
by students. A person without any training can score the extremes
with great accuracy, but the middle ranges can be distinguished
only by fairly well-trained examiners. All parts of the test are scored
for:

a. Clarity of thought. Are sketches clean and recognizable, or were
there fumbling, erasures, and meaningless detail?

b. Quality of line. Are lines smooth, graceful, and accurate, or broken,
cramped, bumpy, and fuzzy?

c. Color. Is there even intensity or spotty uneven pressure; is there good
use of shading?

The correlation between scores assigned to the same tests by different
scorers was about .85 for small highly selected groups. This
agreement is considered to be very satisfactory in the light of the
fact that the group was restricted.

The validity of the test is indicated by a correlation of .53 between
the Art Aptitude Test given in the freshman year and grades
or over-all ratings of success at the end of a 3-year art course among
fifty-two seniors. The ACE Psychological Test correlated only .28
with success in the 3-year course. These correlations would be somewhat
higher if the scores of those who failed to finish the course were
considered. The Art Aptitude Test correlated .15 with the ACE
Psychological Examination.

Intelligence and Art Tests 

The relationship between intelligence and artistic ability is not
clearly defined or measured by published studies.² Definitions of
both types of ability are characterized by vagueness and lack of wide
acceptance. Tests designed to measure these abilities are fairly reliable,
but as yet have failed to give careful analyses of the patterns
themselves. Tests of artistic composition usually fail to distinguish
between creative and copying processes, and tests of intellect fail to
distinguish among such abilities as rote memory, perception, and
reasoning. Furthermore, in research to determine the relationship
of artistic and intellectual abilities the groups of persons studied
have usually been small and rather narrowly selected. The correlations
quoted, therefore, are not readily comparable, and differences
which are reported do not necessarily indicate differences in essential
facts but perhaps merely insufficient samplings. A few typical results
are given.

Goodenough (1931) reported a correlation of .74 between her 
drawing scale of intelligence and the Stanford-Binet IQ scores, using 
334 children between the ages of three and eleven years. Slightly
lower correlations were found for these same children when smaller
single-age groups were used. Bird (1932) and Tiebout and Meier
(1936) found correlations of approximately .43 between IQ’s from
the Goodenough Draw-a-Man Test and IQ’s from two group tests,
the Kuhlmann-Anderson and the Dearborn, using one hundred or
more pupils per grade.

These same investigators found that correlations between the
Draw-a-Man Test IQ’s and Tiebout’s score for artistic drawing were
in the neighborhood of .35 for three groups of one hundred children
each, in the first three grades. In the fourth grade the correlation
dropped to .18. Bird’s (1932) score for drawing correlated .49 with
Goodenough IQ’s in a group of 248, six to nine years old. Manuel
and Hughes (1932) report correlations still higher (.63 to .86) between
Goodenough IQ’s and their ratings of quality of drawings in
836 Mexican children in grades one to six.

2 Summarized by Bird (1930) and Tiebout and Meier (1936).

Tiebout and Meier (1936) found that correlations between measures
of artistic composition and verbal tests of intellect are usually
higher in the first grade than in later grades. The Kuhlmann-Anderson
IQ's and Tiebout scores showed correlations of .36 or .10 in grades
one, two, and three, and nearly zero correlations in the higher grades.
No significant correlations of this sort have appeared for adult
groups. High school pupils considered to be artistically superior were
only slightly above the average Kuhlmann-Anderson IQ's. Fifty artists
selected as the most outstanding among 5,600 names listed in the
Biography of American Artists made an average IQ of 118 on the
Otis Self-Administering Test, and showed their largest numbers of
errors in handling number concepts. Intelligence test scores and
rating as an artist showed a zero correlation in this group.

These figures show that persons who produce artistic works are
slightly superior in verbal intelligence test scores but that artistic
ability is not dependent upon such intellectual capacity.

ANALYTICAL STUDIES OF ARTISTIC ABILITY 

Four studies should be mentioned because they attempt to appraise
skills which are thought to be component elements of artistic
composition. Knauber and Pressey (1927) included separate measures
of skills, which were taken from drawings or completions of
drawings. Their study embraced the following eight abilities: memory
for designs (long and short time), observation, accuracy, imagination,
creative imagination, analyzing ability, ability to visualize, and
design sensitivity.

Lewerenz (1927) prepared nine tests designed to measure basic art
abilities. In these tests, which made frequent use of multiple-choice
and completion items, he included the following procedures:

1. Preferences for design. Choose between four variations of one theme
(14 items of increasing complexity).

2. Originality of line drawing. Draw lines between printed dots to make
a picture (10 items).

3. Indicate omission of shadows in ten drawings.

4. Vocabulary of materials, processes, drawing terms, and pictures (50
pairs of words).

5. Immediate memory span. Reproduce part of a picture of a vase from
memory.

6, 7, and 8. Indicate errors in pictures of cylindrical, parallel, and
angular perspectives.

9. Color-matching test. Six key colors are to be matched with 46 variations
in hue and shade.

Both of these batteries stress accuracy of observation and memory
for details of proportion, shading, and perspective, as well as preferences
for designs and pictures. This sort of battery affords a means of
comparing the relationship between the various subtests and eventually
showing by a factorial analysis the basic patterns in artistic drawing.
Both of these batteries have subtests which are probably too
short for the analysis of factors or the construction of individual
profiles. The method is sound, however, and a fascinating field of
research is open. A correlation between total scores of these two batteries
of .64 was found in a group of sixty-four art students. Correlations
between the total Lewerenz test scores and scores on the McAdory
and the Meier-Seashore tests were approximately .53. Simple
correlations of this sort give no clear basis of analysis, but they suggest
that the tests are measuring either the same processes or related
processes to a marked extent.

Tiebout (1933) and Dreps (1933) compared scores on various tests
of motor coordination, observation, discrimination, and memory
with ratings of artistic ability. They found that the average scores of
small groups of children and adults who were rated as artistically
superior exceeded significantly the average scores of similar groups
rated as inferior in the following tests:

1. Completeness and accuracy of visual observation (Heilbronner and 
Lewerenz tests) 

2. Recall of observed material after ten days or six-month intervals 
(Fernald) 

3. Uniqueness of interpretation of ink blots (Knox) 

4. Originality of line drawing (Lewerenz) 

5. Form discrimination

6. Feature discrimination (Greene)

7. IQ's

8. Aesthetic judgment

Small differences of doubtful significance were found on tests of: 

1. Recognition memory, immediate

2. Completion of a drawing from memory 

3. Visual imagery (Griffitts) 

4. Neurotic tendencies (Pressey X-O) 

No significant differences were found in tests of: 

1. Hand-and-eye coordination (Greene, Whipple)

2. Steadiness of movement (Wellman, Whipple)

3. Color matching (Lewerenz)

Carroll (1932) found small and unreliable correlations between
ability to appreciate or create art and introversion and emotional
instability. Artists as a group showed no more instability than a group
of students of statistics.

Meier (1930) directed an elaborate 10-year study of artistic ability
which included research by twenty co-workers as well as himself. He
described six patterns which he believed to be important in graphic
arts:

1. Manual skill. This ability is regarded as fine hand-and-eye coordination
which can be noted at early ages.

2. Energy output. This is shown by unusual concentration on a task
for long periods.

3. Intelligence. The usual IQ-test scores are above average, with more
success, however, in parts of the test which have to do with visualizing and
speed of perceiving than in the parts that have to do with number and
technical vocabularies.

4. Perceptual facility. By this is meant the ability to observe and recall
sensory experiences.

5. Creative imagination. This is defined as an ability to organize
sense impressions into a "work having some degree of aesthetic character."

6. Aesthetic judgment. This is considered to be the most important
factor of artistic competence. It is defined as ability to recognize unity of
composition, and it is measured by the Meier-Seashore Art-Judgment Test.

These six items were not considered to be mutually exclusive but
were general terms descriptive of complex and interrelated patterns.
Meier believed that the first three factors are primarily inherited
through a line of ancestors, and that the last three are definitely limited
by inheritance. He pointed out that future analyses of artistic
ability will probably indicate that elemental functions similar to those
described by Thurstone (1938) underlie these six patterns. To the
writer, the first four patterns seem to be important in any sort of
craft or occupation where spatial factors are important. The last
two factors seem to be found principally in artists, and hence to
distinguish them from other people.

Any discussion of artistic ability would be incomplete without
mention of the analyses of motives which drive artists to their work.
Such analysis has been attempted by psychoanalysts, and their work
is fruitful and challenging (Chapter XXI).

STUDY GUIDE QUESTIONS 

1. What are the characteristics of measures of special aptitudes?

2. What visual functions are usually measured and how are they measured?

3. What are the principal variations in hearing? How are they measured?

4. What aspects of musical discrimination are measured by the Seashore 
Measures of Musical Talent? 

5. What additional aspects are measured by Kwalwasser and by Bachem?

6. What are the usual relationships between various measures of musical 
talent? 

7. What correlations are usually found between tests of intelligence and 
of musical abilities? 

8. How are tests of preferences for pictures related to art aptitude? 

9. What aspects of drawings are altered in the Meier-Seashore Test? 
The Graves Design-Judgment Test? The Horn Art Aptitude Test? 

10. How can tests of composition be made more reliable?

11. What are the usual correlations between artistic skills?

12. To what extent are intelligence-test scores related to artistic ability? 



CHAPTER XI 


MILITARY DEVELOPMENT 
OF TESTS AND RATINGS 




In this chapter some of the important tests and recently developed
ratings used by the military authorities will be described briefly. The
Army, Navy, and Air Corps each use screening tests for all new recruits,
and the more analytical tests in selecting men and women
for special training and duties. Personality and interest inventories
are also used by the military authorities on significant samples. The
criterion of success usually available is completion of a particular
course of training, and unusually good predictions are shown in many
situations. A few studies give fragmentary reports on the relations
between various tests and performance of military duties. The Army
Alpha and Beta Tests of World War I have been described in Chapter
VIII.


SCOPE OF MILITARY TESTS 

During World War II the military authorities in the United States
called upon the biological and social scientists to help solve many
of the urgent problems of selecting, training, leading, and, when
necessary, of healing members of the Armed Forces. Scientific activities
in developing and using measures of behavior were usually adjusted
to specific military requirements. While these requirements were
similar in many respects to those of civilian schools, industries, and
clinics, still they differed in that they (a) were often concerned with
large screening operations, (b) had as criteria success in occupations
not found in civilian life, for example, fighter piloting and
combat-troop activities of all kinds, and (c) were accomplished under
terrific time-pressure. Also, there were (d) large reserves of available
man power and (e) authority to enforce service by severe penalties.

In 1941 all the large military branches decided to set up coordinated
selection and research programs in various locations on various
problems, rather than a single strongly centralized program. This
decision was made because of (1) the initial and continued difficulties
with communication between widely separated units; (2) the need
to have a very close practical association between particular operations
and selection, training, and research; and (3) the great urgency
for the immediate selection of great numbers of men by the best
method locally available. As the war progressed the common problems
of the various services were sifted through the Committee on
Service Personnel, Applied Psychology Panel, and there was considerable
exchange of specific forms and information. Several projects
were planned for the use of both Army and Navy (Bray, 1948).

Three types of measures were used in the selection of military per- 
sonnel, namely, ability, interest, and adjustment. The ability tests 
were more adequately developed, and were relied upon much more 
than the measures of interest and of adjustment. In mental clinics 
and hospitals, however, measures of adjustment were widely used to 
aid in remedial procedures. 

MILITARY TEST BATTERIES 
The United States Army Tests 

The United States Army Classification System provided standard
mental tests for nearly every important problem of personnel selection.
The steps taken were about as follows:

1. At the induction center before a man was accepted for military 
service he was briefly interviewed for gross defects and literacy. 
If his literacy was in doubt he was given a simple test to deter- 
mine whether or not he could read at about the level of the 
fourth grade. If this test was failed, often a Visual Classifica- 
tion Test was given to discover if he had the ability to under- 
stand and follow directions. 

2. The enlisted man was then sent to one of thirty reception centers,
where all literates were given the Army General Classification
Test (AGCT), a general mechanical test, and the Radio
Telegraph Operator's Aptitude Test. The last two tests were
included because the Army could not expect to recruit enough
trained men in these fields. A Qualification Card containing
such items as education, languages, highest vocational skills,
job history, hobbies, leadership, experiences, and previous military
training was also completed by interview. Test scores
were recorded on this card. If a man possessed skills that were
needed at once, he was sent directly to the unit involved. Other
men (about 50 per cent) were sent to Replacement Training
Centers.

3. At the Replacement Training Centers individual general mental
tests were given to those whose group-test scores seemed to
yield insufficient data. The processes of screening men for all
specialist and officer schools, for special training units, and for
treatment when they were emotionally and mentally inadequate
involved the use of tests such as the following:

Classification Tests
  General Classification Test
  Nonlanguage Test
  Visual Classification Test
  Higher Examination
  Officer Candidate Test
  Women's Classification Test (mental alertness test)
  Army Information Sheet (minimum literacy test)

Aptitude Tests
  Mechanical Aptitude Test
  Clerical Aptitude Test
  Radio Telegraph Operator's Aptitude Test
  Code Learning Test
  Battery of Tests for Combat Intelligence
    Identification of Aerial Photographs
    Map Identification
    Route Tracing
    Battle Maps
    Perception of Detail
    Map Reading
    Map Orientation

Educational Achievement Examination, Army Specialized Training
Program (ASTP)
  Algebra
  Arithmetic
  English Grammar and Composition
  French
  General History
  German
  Inorganic Chemistry
  Physics
  Plane and Solid Geometry
  Spanish
  Trigonometry
  United States History
  Combined Algebra, Trigonometry, and Geometry

Trade Knowledge Tests
  General Automotive Information Test
  General Electricity and Radio Information Test
  General Electrical Information Test
  General Radio Information Test
  Driver and Automotive Information Test

Warrant Officer Examinations
  About thirty technical examinations in various fields.

The Army General Classification Test (AGCT), in four equivalent
forms, was administered to more than 9 million persons during
World War II. It was designed to be a test of learning ability for
literate adults. The first form of the AGCT (Form 1a) was released
October 1940, and the last (Form 1d) in October 1941. Forms 1a and
1b each contained 150 items and were preceded by a separate practice
booklet. Forms 1c and 1d each included 10 practice items and
140 test items. All forms rotated three types of items in this order of
presentation: vocabulary, arithmetic, and block counting. The time
limit was 40 minutes, and the score was the number right minus one
third the number wrong, because all items had four choices. In
selecting items the Committee on Classification of Military Personnel
agreed to emphasize the following points of view (Personnel Research
Section, 1945):

1. The tests should include both verbal and nonverbal items. 

2. Assuming that modern warfare is rapidly becoming more 
technical, emphasis was to be placed upon items calling for spatial 
thinking and for quantitative reasoning. 

3. It was planned to keep at a minimum items greatly influenced 
by amount of schooling and by cultural inequalities generally. To 
this end the use of information items was not planned. 

4. Insofar as possible, the time or speed element was to be minimized.
(This aspect was ignored in later practical situations.)

5. The General Classification Test was not to serve the purpose of
trade tests.

6. It was specifically recognized that the test was not to measure
personality traits. (However, it was recognized that emotional stress
might seriously affect one's score.)

7. The test should appeal to the average officer and soldier as
sensible.

About five thousand items were tried out on samples of enlisted
men to determine difficulty as well as to find if they met the considerations
listed above. Tentative norms were later set for total
scores, and for easy interpretation of all forms of the test the mean
score was set at 100 and the standard deviation at 20. Table 5 shows
the Roman numerals used for army grades and the corresponding
distribution figures.
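
The scoring rule and the standard scale described above can be sketched in a few lines of Python. The sketch assumes the usual correction-for-guessing form, rights minus wrongs divided by one less than the number of choices, which reduces to the one-third penalty quoted for four-choice items. It also assumes a simple linear conversion to the scale with mean 100 and standard deviation 20. The function names and the raw-score mean and standard deviation in the example are hypothetical and are not taken from the Army norms.

```python
def corrected_score(n_right: int, n_wrong: int, n_choices: int = 4) -> float:
    """Rights minus wrongs/(choices - 1); omitted items are not penalized."""
    return n_right - n_wrong / (n_choices - 1)


def standard_score(raw: float, raw_mean: float, raw_sd: float,
                   scale_mean: float = 100.0, scale_sd: float = 20.0) -> float:
    """Linear conversion of a raw score to a scale with mean 100 and SD 20."""
    return scale_mean + scale_sd * (raw - raw_mean) / raw_sd


# Hypothetical examinee: 90 right, 30 wrong, 20 omitted out of 140 items.
raw = corrected_score(90, 30)              # 90 - 30/3
# Hypothetical norm group with raw mean 70 and raw SD 25 (not Army figures).
scaled = standard_score(raw, 70.0, 25.0)   # 100 + 20 * (80 - 70) / 25
```

With these made-up figures the corrected raw score is 80 and the standard score is 108, that is, four tenths of a standard deviation above the mean.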

The reliabilities, which have been computed many times on various
samples, were approximately .84 for retests at various intervals
of months, .90 for alternate forms, .90 for odd-even or Kuder-Richardson
methods. There were slightly negative correlations with age,
-.20 to -.33, for a group of officers whose mean age was about
thirty-two years. This was partly due to the fact that the AGCT was closely
timed (40 minutes) and that speed was an important factor. In a
group of 4,330 enlisted men correlation with age was .02.
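
As an illustration of the odd-even method named above, the following Python sketch correlates scores on the odd-numbered and even-numbered items and then steps the half-test correlation up to full length with the Spearman-Brown formula. The step-up is the customary companion of the odd-even method, but the source does not say which variant the Army used, so this is an assumption. The function names and the item marks are made up for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half: float) -> float:
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


def odd_even_reliability(item_scores):
    """item_scores: one list of 0/1 item marks per examinee."""
    odd = [sum(row[0::2]) for row in item_scores]    # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]   # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))


# Made-up marks for five examinees on six items (illustration only).
marks = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
reliability = odd_even_reliability(marks)
```

A real computation would of course use hundreds of examinees and the full set of test items; five cases serve only to show the arithmetic.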

ILLUS. 112. GRADE DISTRIBUTION OF MEN PROCESSED THROUGH
RECEPTION CENTERS, 1940-44, AGCT

Army Grade      Score Limits        Percentage of Total Group

I               130 and above*                 6.0
II              110-129                       26.5
III             90-109                        30.5
IV              60-89                         27.7
V               59 and below                   9.3

Total number of cases . . . 8,293,879

* Mean = 100, SD = 20.
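
The grade limits in this illustration amount to a simple lookup. The Python sketch below reads the limits as 130 and above, 110-129, 90-109, 60-89, and 59 and below; the function name is hypothetical, and the percentages are those reported in the table.

```python
def army_grade(agct_score: int) -> str:
    """Return the Army grade (I to V) for an AGCT standard score."""
    if agct_score >= 130:
        return "I"
    if agct_score >= 110:
        return "II"
    if agct_score >= 90:
        return "III"
    if agct_score >= 60:
        return "IV"
    return "V"


# Percentages of the total group reported in the table, by grade.
reported_pct = {"I": 6.0, "II": 26.5, "III": 30.5, "IV": 27.7, "V": 9.3}
total_pct = round(sum(reported_pct.values()), 1)   # the five grades together
```

A score of 100, the mean of the scale, falls in Grade III, and the five reported percentages account for the whole group.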

The correlation between highest grade completed in school and
the AGCT was approximately .70. The correlations between the
AGCT and other tests of mental ability were found to be in the
neighborhood of .80, but ranged from .65 for the ACE Psychological
Examination to .90 for the Army Alpha, Wells' Revision, long form.

The major usefulness of the AGCT was its value in selecting men
for specialist training courses. Illustration 113 shows a few of the
several hundred validity coefficients that were available. These correlations
are not directly comparable because the groups were preselected
by education or experience, or by the AGCT itself. For instance,
a prerequisite for officer candidate schools was an AGCT
score of 110 and for the Army Specialized Training (AST) Program
a score of 115. The correlations would, of course, have been much
higher for unselected groups. In general, grades in clerical subjects,
English, and mathematics were predicted by the AGCT a little more
accurately (.40) than grades for mechanics, radio, and motor transport,
engineering, and foreign languages (.20 to .30). None of these
correlations are concerned with evaluations of actual success on the
job.
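
The source gives no corrected figures. As a rough illustration of why preselection lowers these coefficients, the Python sketch below applies the standard textbook correction for direct restriction of range on the predictor. Treating the selection as direct and as made on the AGCT alone is an assumption, as is the use of the scale SD of 20 for the unselected group; the function name is hypothetical.

```python
from math import sqrt


def correct_for_range_restriction(r: float, sd_restricted: float,
                                  sd_unrestricted: float) -> float:
    """Estimate the correlation in the unselected group
    (direct selection on the predictor is assumed)."""
    k = sd_unrestricted / sd_restricted
    return r * k / sqrt(1 - r ** 2 + (r ** 2) * (k ** 2))


# Figures as read from Illustration 113 for the first group listed:
# r = .40 in a group whose AGCT SD was 11.1; the AGCT scale SD is 20.
estimate = correct_for_range_restriction(0.40, 11.1, 20.0)
```

Under these assumptions the estimate comes out a little above .60, which is in keeping with the statement that the correlations would have been much higher for unselected groups. When the two standard deviations are equal the formula returns the observed correlation unchanged.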

In April 1945, the AGCT, Form 3a, was issued. It has a total score
similar to that of the earlier forms but allows for a profile of subtest
scores in four areas: reading and vocabulary, arithmetic computation,
arithmetic reasoning, and pattern analysis.


ILLUS. 113. EXAMPLES OF VALIDITY COEFFICIENTS, AGCT

Population                                Criterion                            N      Mean     SD      r

Administrative Clerical Trainees, AAF     Grades                             2,947   121.7   11.1    .40
Clerical Trainees, WAAC                   Grades                               199   116.8   12.0    .62
Airplane-Mechanic Trainees                Grades                                99   104.8   10.6    .32
Aircraft Armorer Trainees                 Grades                             1,907   117.3   10.9    .40
Radio Operator & Mechanic Trainees, AAF   Grades                             1,055   122.4   11.1    .32
Gunnery Trainees, Armored                 Grades                                66   120.0   12.1    .50
Motor Transport Trainees, WAAC            Grades                               269   111.4   13.6    .31
Truckdriver Trainees                      Road-Test Ratings                    421    95.5   20.1    .13
Weather-Observer Trainees, AAF            Grades                             1,042   130.2   12.5    .43
Officer Candidates, Infantry              Grades, Academic                     103   123.0   10.8    .30
Officer Candidates, Infantry              Leadership Ratings                   201   122.6   10.8    .12
AST Trainees, Basic Engineering           Grades, Inorganic Chemistry          222   126.6    7.8    .21
AST Trainees, Personnel Psychology        Ranks in Tests and Measurements      130   134.0   10.3    .29
West Point Cadets, 4th Class              Grades, English*                     932   131.3   10.9    .40
West Point Cadets, 4th Class              Grades, Mathematics*                 932   131.3   10.9    .43
West Point Cadets, 4th Class              Grades, Military Topography          932   131.3   10.9    .40
West Point Cadets, 4th Class              Grades, Tactics                      932   131.3   10.9    .29
West Point Cadets, 4th Class              Grades, Spanish*                     932   131.3   10.9    .19

* First term.

(By permission of the editor of the Psychological Bulletin.)

Occupational Norms. The widespread interest in the occupational
norms from World War I Alpha and Beta Tests led two authors,
Harrell (1946) and Stewart (1947), to prepare norms for various
civilian occupations from the World War II tests. These studies
did not include officers; the subjects were enlisted men who claimed
experience in various civilian occupations.

Harrell (1946) reported norms for 774,383 men who were classified
into 209 occupations in the continental Army Air Force in 1943.
Only those occupational groups with at least one hundred men were
included. Stewart (1947) listed 220 occupational groups including
technicians and skilled and semi-skilled workers. The sample was
selected from a survey made in September 1944, of approximately
150,000 soldiers or 2 per cent of all United States Army personnel.
Those whose serial numbers ended in 19 or in 75 were selected for a
smaller random sample of 103,998. The number was finally reduced to
68,325 men whose records were complete and who belonged to occupations
which included at least twenty-five men each.
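
Selection by the final digits of the serial number is easily sketched. Two endings out of the hundred possible two-digit endings give a sample of about 2 per cent, provided the final digits are evenly distributed, which is the assumption made here. The serial numbers below are made up for illustration, and the function name is hypothetical.

```python
def in_sample(serial_number: str, endings=("19", "75")) -> bool:
    """True when the serial number ends in one of the chosen two-digit endings."""
    return serial_number.endswith(endings)


# Made-up serial numbers 00000000 ... 00099999 (illustration only).
roster = [f"{n:08d}" for n in range(100_000)]
sample = [s for s in roster if in_sample(s)]
fraction = len(sample) / len(roster)   # two endings out of one hundred
```

Because the choice depends only on the last two digits, it is unrelated to occupation or test score, which is what makes such a sample serve as a random one.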

Illustration 114 gives a sample of the results in commonly found
occupations. These are typical both of recent reports and also of
World War I results. This table and other data that have been reported
show that the spread of scores (variability) for the occupations
with the highest mean scores was only about half the variability for
the occupations with the lowest mean scores. For various reasons
many above-average persons were found in relatively unskilled occupations,
but those below average were seldom found in the more
skilled occupations. This means that the test would show cut-off or
critical scores for the jobs on the higher levels only. One of the most
significant findings, however, is that there is great overlapping between
occupations. This test can therefore be used only for the
roughest kind of screening. Since no validity correlations are available,
it should be used cautiously for individual counseling. In the
selection of workers, more analytical tests, experience, interest, and
special training would be more indicative of success than the AGCT
scores.
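
A quick check on the statement about variability can be made from the percentiles themselves. The Python sketch below assumes that the last column of Illustration 114 is the semi-interquartile range, Q = (P75 - P25)/2. This reading is an inference from the tabled figures, not something the source states, and the function name is hypothetical.

```python
def semi_interquartile_range(p25: float, p75: float) -> float:
    """Half the distance between the 25th and 75th percentiles."""
    return (p75 - p25) / 2


# P25 and P75 as read from Illustration 114.
q_accountant = semi_interquartile_range(121, 136)    # a high-scoring occupation
q_farm_worker = semi_interquartile_range(70, 103)    # a low-scoring occupation
ratio = q_accountant / q_farm_worker
```

For these two occupations the values are 7.5 and 16.5, so that the spread for accountants is a little under half the spread for farm workers.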

The Army Individual Test of General Ability was prepared (1)
to aid in deciding whether or not to discharge a man for general
inaptitude and (2) to aid in clinical diagnosis. To meet these needs a
battery of tests was developed which (a) covered the same abilities
and range of abilities as the AGCT, (b) contained both verbal and
nonverbal material, (c) could be given to any racial group, (d) could
be administered by one not trained in psychometry, and (e) would
require about an hour's time and few materials.

Seventeen tests were tried out on 250 white and 215 colored soldiers,
approximately one hundred from each of the five grades of
the AGCT. From correlations between these tests (the AGCT and
schooling) and from the specifications above, six tests were selected:

1. Story Memory. A short paragraph is read to the examinee, who is
asked to repeat it and answer questions.

2. Similarities-Differences. The examinee is asked to tell how pairs of
words are alike and different.

3. Digit-Span. Three to ten digits are to be repeated in the order given
and three to nine in reverse order.

4. Shoulder Patches. The examinee is asked to duplicate a colored design
by selecting and placing pieces from among nineteen colored cut-out
designs.

5. Trail Making. The examinee is asked to draw a line from number to
number in sequence, or from letter to letter. The score is the time used and
the number of errors.

6. Cube Assembly. The examinee is asked to duplicate the arrangement
of cubes shown in three pictures, by using twenty-four actual cubes on the
table. Score is time needed.

The whole test takes about 40 minutes and showed a reliability of
.93 for one thousand white soldiers. The subtest median intercorrelation
was .43, with the highest correlation, .61, between the first
two tests, which emphasize verbal comprehension. The lowest correlation
was .32 (between Digit-Span and Cube Assembly).

In order to have the first three tests, which are verbal, contribute
the same amount to the variance of the total score as the last three
tests, which are nonverbal, the scores of three of the tests were
weighted. Standard scores having a mean of 100 and SD of 20 were
provided for verbal, nonverbal, and total scores.

The correlations between the AGCT and the following individual
tests were:

Story Memory                .61
Similarities-Differences    .71
Digit-Span                  .56
Shoulder Patches            .61
Trail Making                .65
Cube Assembly               .54
Verbal tests                .78
Nonverbal tests             .74
Total score                 .84


ILLUS. 114. CIVILIAN OCCUPATIONS AND AGCT SCORES
(Mean = 100, SD = 20)

Occupations                                    Number   P10   P25   P50   P75   P90      Q

Professional
  Accountant                                      216   114   121   129   136   143    7.5
  Student, mechanical engineering                  62   114   122   128   136   140    6.5
  Student, medicine                               124   116   120   127   135   140    7.5
  Writer                                           54   114   123   126   133   140    6.0
  Teacher                                         360   110   117   124   132   140    7.5
  Lawyer                                          164   112   118   124   132   141    7.0
  Student, business or public admin.              152   114   118   124   131   140    6.5
Clerical
  Statistical Clerk                                72   114   119   125   133   141    7.0
  Bookkeeper, general                             302   108   114   122   129   138    7.5
  Chief Clerk                                     297   107   114   122   131   141    8.5
  Stenographer                                    206   109   115   122   130   139    7.5
  Tabulating Machine Operator                      61   102   111   120   127   134    8.0
  Clerk-Typist                                    616   101   110   119   126   136    9.0
  Clerk, general                                2,063    97   108   117   125   133    8.5
  File Clerk                                      119    96   105   114   123   129    9.0
  Stock Clerk                                     791    85    99   110   120   127   10.5
  Sales Clerk                                   2,362    82    95   109   119   128   12.0
Technicians
  Draftsman, mechanical                            99   105   111   120   128   135    8.5
  Tool Designer                                    54   102   110   119   128   141    9.0
  Physics Laboratory Assistant                    125    96   106   116   124   133    9.0
  Photographer                                     70    88   109   114   124   129    7.5
  Parts Clerk, automotive                         133    90    98   110   119   127   10.5
  Installer-Repairman, Telephone & Telegraph       62    98   108   115   120   1?3    6.0
Mechanical trades
  Airplane Engine Mechanic                        111    92   102   114   123   130   10.3
  Tool Maker                                      117    92   101   112   123   129   11.0
  Foreman, machine shop                            48    ..    ..   110    ..    ..     ..
  Machinist                                       617    80    99   110   120   127   10.5
  Engine Lathe Operator                           283    89   101   110   120   128    9.5
  Machinist's Helper                              421    83    96   108   118   125   11.0
  Electrician, automotive                          57    88   100   108   113   127    7.5
  Machine Operator, designated machine          3,011    77    89   103   114   123   12.5
  Truck Driver, heavy                           3,173    71    83    98   111   120   14.0
  Truck Driver, light                           3,966    69    80    95   109   119   14.5
Construction trades
  Carpenter, heavy construction                    82    87    97   112   124   132   13.5
  Carpenter, general                            1,001    73    86   101   113   123   13.5
  Electrician                                     435    83    96   109   118   121   11.0
  Cabinetmaker                                    111    80    92   108   119   130   13.5
  Structural Steel Worker                         107    76    88   101   119   126   15.0
  Foreman, construction                           281    72    88   104   118   128   15.0
  Plumber                                         222    71    87   103   111   123   13.5
  Painter, general                                680    70    83    99   113   121   15.0
  Construction Machine Operator                   145    70    79    97   107   117   14.0
  Crane Operator                                  128    72    87    96   111   120   12.0
  Miner                                           502    67    73    87   103   114   14.0
Students
  Mechanical Engineering                           62   114   122   128   133   140    6.5
  Medicine                                        124   116   120   127   135   140    7.5
  Chemical Engineering                             73   105   117   125   134   142    8.5
  Business or Public Administration               152   114   118   121   131   140    6.5
  Sociology, high school, academic              2,608    92   102   113   122   129   10.0
  High School, commercial                         275    90    99   110   118   124    9.5
  Manual Arts                                      60    87    99   109   121   132   11.0
  High School, vocational                         504    83    96   108   115   124    9.5
Other
  Teamster                                        281    64    71    97   101   114   15.0
  Barber                                          166    60    79    93   109   120   15.0
  Farm worker                                   7,475    61    70    86   103   115   16.5

(Adapted by permission of Naomi Stewart and the editors of Educational and
Psychological Measurement.)



A College Qualifying Test (C-1) was given for both Army and
Navy at fourteen thousand educational centers in April 1943 to
about three hundred thousand officer candidates. The following data
are taken from the results of this testing:

                                                              Number
Test                                                         of Items   Minutes

I. Verbal
     Opposites                                                   30
     Analogies                                                   15
     Double Definitions                                          15        30
II. Scientific Background Information                            40        30
III. Reading of Paragraphs: Economics, History, and Biology      20        25
IV. Mathematical Problems: Algebra and Geometry                  30        35

Total                                                           150       120


Army Specialized Training Program Tests. During 1943 and
1944 more than 140 subject matter achievement tests were constructed
for use in the Army Specialized Training Program. Approximately
one million of these tests were given to one hundred
fifty thousand trainees in two hundred colleges and universities. This
nation-wide training program of the Personnel Section of the Adjutant
General's Office was designed to evaluate both the achievement
of individual students and the content and quality of instruction.
New and somewhat equivalent forms were produced every 3 months
for many of the subjects, because each term of instruction lasted
3 months, and it seemed desirable not to use the same form twice
in any one institution. For some subjects as many as eight forms
were produced. The basic-training phase included mathematics,
physics, chemistry, English, geography, and history. The advanced
phases included medicine, engineering, personnel psychology, and
foreign languages.

The testing program had three important results: (1) Course outlines
were revised, made more definite and more uniform, and probably
more effective in learning. (2) Item analysis indicated the
difficulty of items and the internal consistency of tests to a degree
not before attempted. (3) Instruction became more standardized, as
was shown by correlations between instructors' grades and achievement
test scores. This was most noticeable in the less-standardized
courses, for example, English and geography.

The prediction of success on these achievement tests from a college
qualifying examination designed to select candidates for the AST
program was as high as .71 for combined achievement scores in
mathematics, physics, and chemistry. Instructors' grades for this
same combination of courses correlated only from .15 to [illegible], showing
either that the instructors' grades were less reliable or that they
measured other factors. These results are similar to those reported
for the Navy V-12 program by Crawford and Burnham (1945).

The Army Air Force Batteries

The AAF had two batteries: the Qualifying Examination (Davis,
1947), for all candidates, and the Air Crew Classification Battery
(DuBois, 1947). The latter was used at classification centers to assign
those who had passed the Qualifying Examination to the most appropriate
air crew training, such as fighter pilot, bomber pilot,
navigator, bombardier, radar observer, and flight engineer.

The Qualifying Examination. The Qualifying Examination was
designed for use in selecting men who could become good officers
and pilots. It was a power test with a liberal time allowance
(3 hours) and corrections for guessing, so that there was no advantage
in the examinee's marking every item.
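
The effect of a correction for guessing can be illustrated with a short
sketch. It assumes the conventional formula, rights minus wrongs divided
by one less than the number of choices; the text does not say which
formula or how many choices per item the AAF examination used, so the
function name and the five-choice example are illustrative only.

```python
def corrected_score(rights: int, wrongs: int, choices: int) -> float:
    """Conventional correction for guessing: R - W / (k - 1).

    Omitted items are neither rewarded nor penalized.
    """
    if choices < 2:
        raise ValueError("an item needs at least two choices")
    return rights - wrongs / (choices - 1)


# Hypothetical examinee who knows 60 of 100 five-choice items and omits the rest.
careful = corrected_score(rights=60, wrongs=0, choices=5)

# The same examinee guessing blindly on the other 40 items would, on
# average, gain 8 rights and 32 wrongs; the correction removes that gain.
guesser = corrected_score(rights=60 + 8, wrongs=32, choices=5)
```

Under these assumptions both examinees receive the same corrected score,
which is the sense in which marking every item brings no advantage.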

Illustration 115 lists the parts of the AAF Qualifying Examination,
1942, 1943, and 1944. The General Vocabulary and Contemporary
Affairs Subtests of the 1942 battery were replaced by information
about driving an automobile, flying, and aviation, which were
also considered to be indirect indications of strength of interests.
The mathematics problems of the 1942 battery and the tests of 1943
on estimating distances were omitted in later batteries, since they
failed to bear out the earlier predictions of success in basic pilot
training. Part of this decrease may have been due to changes in
standards of pilot training. The Reading Comprehension and
Mechanical Comprehension Tests were retained and improved so that
their validity correlations were nearly doubled. Two rather difficult
visual perception tests, Planning Circuits and Hidden Figures, were
introduced in 1943, and the latter was retained in 1944. Both of
these seemed to give better prediction when used as speed tests,
but in this examination speed was never stressed.

The correlations in Illus. 115 show that the AAF Qualifying
Examination was able to save a great deal of delay and expense in
pilot training. From approximately 1,200,000 men who were given
the qualifying test, 100,000 were accepted for flight training. A
passing mark was usually set so that about 36 per cent of men above the
mark graduated from advanced pilot training, while only 11 per
cent of those below the mark finished. Before the testing program
became effective it was necessary to enroll 397 men in order to
graduate one hundred, but after the qualifying examination was
used only 180 men were enrolled. If in addition the Air Crew
Classification Battery was used only 150 men were enrolled.

ILLUS. 115. ARMY AIR FORCE QUALIFYING EXAMINATION CORRELATED
WITH GRADUATION FROM BASIC PILOT TRAINING

                                      Number      Biserial
Section                               of Items    Correlation

FORM AC10A, 1942
Vocabulary, General                      45         -.04
Reading Comprehension                    15          .14
Judgment, Practical                      16          .36
Mathematics Problems                     30          .14
Contemporary Affairs Information         30          .24
Mechanical Comprehension                 15          .29
Total                                   150       [illegible]

FORM AC12I, 1943
Planning Circuits                        45          .26
Hidden Figures                           45          .31
Path Distance                            30          .14
Point Distance                           30          .17
Judgment Reasoning                       25          .33
Aviation Information                     35          .34
Mechanical Comprehension                 60          .48
Total                                   270

FORM AC14L, 1944
Reading Comprehension                    16          .26
Information: Drive, Fly, Aviation        50          .29
Mechanical Comprehension                 60          .60
Hidden Figures                           25          .36
Total                                   160          .62
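
The enrollment figures above can be restated as graduation rates. The
following is only an arithmetic sketch using the three figures quoted in
the text; the dictionary keys and the function name are labels chosen
here for illustration.

```python
# Men enrolled in order to graduate one hundred, as quoted in the text.
enrolled_per_100_graduates = {
    "no testing program": 397,
    "Qualifying Examination": 180,
    "Qualifying Examination + Classification Battery": 150,
}


def graduation_rate(enrolled: int, graduates: int = 100) -> float:
    """Fraction of enrolled men who graduate."""
    return graduates / enrolled


rates = {name: graduation_rate(n) for name, n in enrolled_per_100_graduates.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.0%} of those enrolled graduate")

# Training places saved per hundred graduates when both tests are used.
saved = (enrolled_per_100_graduates["no testing program"]
         - enrolled_per_100_graduates["Qualifying Examination + Classification Battery"])
```

On these figures roughly one enrolled man in four graduated before
testing, against about two in three when both batteries were used.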

Air Crew Classification. The first battery of the Air Crew Classi- 
fication Test appeared in February 1942, and there were ten re- 
visions by June 1945. About 6 hours were needed for approximately 
twenty-one tests, of which from four to six required apparatus and 
the rest used paper and pencil. The later batteries included the 
same broad fields as the earlier ones, but fewer short tests of
perception and more of technical information and spatial thinking
were used. Biographical data were also included in the final scores.

Stanine scores were composites of the scores of these classification
subtests, weighted differently to predict success in various training
situations. Stanines are scores on a 9-point scale, each point
representing a range of scores of one half a standard deviation. In this
way the first stanine always includes the lowest 4 per cent of the
group, the second stanine the next 7 per cent, etc. (Illus. 116).


ILLUS. 116. COMPARISON OF STANINES, CENTILES, AND AREAS
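
The relation between stanines and areas of the normal curve can be
checked with a short sketch. It assumes that scores have already been
converted to standard (z) scores in a normally distributed group and
that the fifth stanine is centered on the mean; the function names are
illustrative and not taken from the AAF reports.

```python
import math
from statistics import NormalDist


def stanine(z: float) -> int:
    """Map a standard score to a stanine (1 to 9), each half an SD wide."""
    return max(1, min(9, math.floor(2 * z + 5.5)))


def stanine_areas() -> list[float]:
    """Per cent of a normal distribution falling in each stanine."""
    nd = NormalDist()
    # Boundaries between adjacent stanines lie at z = -1.75, -1.25, ..., +1.75.
    cuts = [-1.75 + 0.5 * i for i in range(8)]
    cdf = [0.0] + [nd.cdf(c) for c in cuts] + [1.0]
    return [100 * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]


# Rounded areas for stanines 1 through 9.
areas = [round(a) for a in stanine_areas()]
print(areas)
```

Rounded to whole per cents, the areas run 4, 7, 12, 17, 20, 17, 12, 7,
and 4, which agrees with the first two values given in the text.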



The first stanines for the earlier batteries were computed from
estimates of judges. As soon as validation data were obtained,
stanines were computed by using multiple regression weights.
Because it was desired that stanine scores should be as independent of
each other as possible, different tests as well as different weights on
the same tests were used to some degree. The correlations between
stanines in the 1944 battery ranged from .50 between navigator and
fighter pilot, to .90 between fighter and bomber pilot. The heaviest
loadings for navigator were Numerical and Verbal; for bombardier,
Perceptual, Numerical, and Spatial; for pilots, Psychomotor,
Coordination, and Mechanical Experience; and for officer quality,
Verbal and General Reasoning (Flanagan and Fitts, 1944).

The AAF Air Crew Classification Test was used with approximately
six hundred thousand candidates. The prediction by pilots'
stanine scores of graduation or elimination from elementary pilot
training (DuBois, 1947) was calculated for twenty-eight large groups
of trainees over a period of 3 years. The earliest biserial
correlations were .31, .33, and [illegible]; the latest, .70, .46, and .03.
Eliminations were computed for each stanine on primary, basic, and
advanced pilot training among about fifty thousand cadets during a
period of 2 years. Eighty per cent of the first stanine were eliminated,
but only about 13 per cent of the ninth stanine. In 1944 and 1945
prediction by bombardier stanine score of graduation from the
bombardier courses ranged from .27 to .30, and similar figures for
navigator training ranged from .45 to .50 for large groups, but from
.35 to .79 for smaller groups. None of these predictions was improved
significantly by the addition of measures or estimates of interest, of
temperament, or of adjustment.
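
The biserial correlations quoted above relate a graded score (the
stanine) to a two-way outcome (graduation or elimination). The sketch
below uses the conventional textbook formula, the difference between
the two group means divided by the standard deviation of all scores and
multiplied by pq/y, where y is the ordinate of the normal curve at the
point dividing the proportions p and q. The text does not describe the
AAF computing procedure, and the ten cadets are invented for
illustration.

```python
from statistics import NormalDist, fmean, pstdev


def biserial(scores, passed):
    """Biserial r between a graded score and a pass/fail criterion."""
    grads = [s for s, ok in zip(scores, passed) if ok]
    elims = [s for s, ok in zip(scores, passed) if not ok]
    p = len(grads) / len(scores)   # proportion graduating
    q = 1 - p                      # proportion eliminated
    nd = NormalDist()
    y = nd.pdf(nd.inv_cdf(p))      # normal ordinate at the pass/fail cut
    return (fmean(grads) - fmean(elims)) / pstdev(scores) * (p * q / y)


# Hypothetical stanines and outcomes for ten cadets (True = graduated).
stanines = [1, 2, 3, 4, 5, 5, 6, 7, 8, 9]
graduated = [False, False, False, True, False, True, True, True, True, True]
r_bis = biserial(stanines, graduated)
```

The formula assumes that a normally distributed trait underlies the
pass/fail division; with very small or lopsided groups the coefficient
can be unstable, which may help explain the wider range the text reports
for smaller groups.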

Navy Batteries 

The United States Navy Basic Test Battery (Stuit, 1947), first ad- 
ministered in 1943, had six parts, which were not selected directly 
from factorial analyses nor intended to be pure tests. The second 
and third forms were revisions based in part on item analyses and 
in part on a selection of items which correlated most highly with 
total scores on the subtest of which they were a part. The tests were 
designed to predict success in specific courses, hence they included 
tests of knowledge in specific fields, mechanics and electricity,
spelling, and a reading test on material related to navy life. The
composition of each of the six tests was, in brief, as follows:

GCT. The General Classification test was a general word knowledge test
composed of 30 completion items, 30 opposites, and 40 analogies.
The latter are usually classed as a reasoning test.

READ. The Reading test was made up of 30 5-choice items based on short
paragraphs.

ARI. The Arithmetic Reasoning test had 30 verbally stated problems.

MAT. The Mechanical Aptitude test consisted of 45 Block Counting
items, 44 Mechanical Comprehension, and 40 Surface-Development
items.

MKM. The Mechanical Knowledge, Mechanical test contained 75 items, of
which 35 were pictorial and 40 written.

MKE. The Mechanical Knowledge, Electrical test consisted of 60 items,
of which 25 were pictorial and 35 written.

Three special-aptitude tests were added later to round out the 
battery: 

CLER. The Clerical Aptitude test had 55 alphabetizing, 83
name-comparison, and 75 number-comparison items.

SPELL. The Spelling test had 50 items, in each of which one of five words
was misspelled.

CODE. The Radio Code test (speed of response), which was developed
for the Army and Navy by the National Defense Research Committee,
consisted of a learning unit in which three characters were
taught, and then a unit used to test candidates in receiving the
three characters, but at four different speeds.

The Spearman-Brown reliabilities for Form 2 ranged from .88 to
.95 (median .90) and the alternate form reliabilities ran about 6 or
7 points lower. The subtests correlated with age from -.09 to .19 and
with highest grade completed from .39 for MKM to .65 for both GCT
and CLER (median .56). The intercorrelations of subtests on Form
2 ranged from .33 between MKM and SPELL, to .85 between GCT
and READ (median about .66). The GCT had the highest median
correlation with the seven other subtests, with another subtest a close
second. CODE was least related to the others. These showed much
higher intercorrelations than are considered desirable for independent
tests, but the predictions of success were high enough to be useful.
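
A Spearman-Brown reliability is obtained by stepping up the correlation
between two halves of a test to estimate the reliability of the whole.
The sketch below uses the standard Spearman-Brown formula; the half-test
correlation of .82 is a hypothetical value chosen only because it steps
up to about the median of .90 quoted above.

```python
def spearman_brown(r: float, factor: float = 2.0) -> float:
    """Estimated reliability of a test lengthened `factor` times.

    r is the reliability of the shorter test (for split-half work,
    the correlation between the two halves, with factor = 2).
    """
    return factor * r / (1 + (factor - 1) * r)


# Hypothetical correlation between odd and even halves of a subtest.
r_half = 0.82
r_full = spearman_brown(r_half)
print(f"half-test r = {r_half:.2f}, stepped-up reliability = {r_full:.2f}")
```

Because the stepped-up figure rests on a single administration, it is
not surprising that the alternate-form reliabilities, which also reflect
differences between forms and occasions, ran somewhat lower.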

The prediction of final grades in seven types of elementary training
from six tests of the Basic Test Battery was reported for groups of
a thousand or more. These showed that for all the schools together
the Arithmetic Reasoning Test gave slightly better prediction than
the others (about .50) and the Reading Test slightly lower prediction
(.37). The Mechanical Knowledge Test (MKM) gave the best
single prediction (.64) for aviation machinists' mates. The Mechanical
Knowledge, Electrical (MKE), and Arithmetic Reasoning (ARI)
Tests predicted electrical training success most highly (.57 each). For
diesel training the MKE gave the best prediction (.46) and for
machinists' mates the Arithmetic Reasoning (ARI) prediction (.49) was
slightly ahead of the rest. In basic engineering the Arithmetic
Reasoning (ARI) Test predicted success with a .60. These are all
significant predictions for group success.

Of 46,500 trainees assigned to ten navy schools, eleven thousand
were assigned who had scores below that recommended, usually
because of quota pressure. The percentage of failure due to lack of
aptitude among the eleven thousand was about [illegible] times as large
as among those above the recommended cut-off score. The percentage
of failure due to lack of interest showed a similar pattern (Chapter
XXII).

A criterion of performance on shipboard was secured by rank, order,
and rating, adjusted to eliminate amount of experience. Separate
rankings were made of three characteristics: petty officer qualities,
technical competence, and over-all desirability. The correlations
between them were so high (.80 to .95) that only the technical
competence ratings were used as criteria of success.

A sample of 1,868 men on 27 different ships (9 destroyers, 12
carriers, and 6 cruisers) was studied. Six naval ratings or occupations
were reported separately. The results from the six tests of the
basic battery were given in detail. For radio mates and signal mates
average correlations of success with technical competence were highest
for the General Classification Test and Arithmetic Reasoning
(about .31). Technical competence of radar operators was predicted
best by Arithmetic Reasoning, .44; of fire control men by Mechanical
Aptitude, .33; of machinists' mates and gunners' mates by Mechanical
Knowledge, .36. There were smaller but usually significant
correlations between years of civilian education and test scores, and
between civilian education and criteria of success. The correlations
with the other tests in the battery were all lower (from -.11 up).

The United States Navy also developed the following series of tests 
for officers: 

1. The Officer Qualification Test, Form 2, was a one-hour test 
of one hundred items which included 50 Verbal Opposites, 30 Me- 
chanical Comprehension, and 20 Arithmetic Reasoning items. It was 
used as a screening test for civilian applicants for commissions. 

2. The Officer Classification Test, Form X-1, was an aptitude test
containing 255 items. It was more difficult than the Officer
Qualification Test and had relatively shorter time limits. It included:

                                          Number      Time in
                                          of Items    Minutes

Verbal Opposites                             60          60
Mechanical Comprehension                     45          20
Electrical and Mechanical Information        45          10
Mathematical Problems                        45          45
Block Assembly                               30          15
Rotation of Solid Figures                    30          20
Total                                       255         170

The United States Navy activities in defining goals and measuring 
achievement both for officers and for enlisted men (Stuit, 1947) were 
outstanding. For instance, among the Elementary Enlisted Schools 
both paper-and-pencil tests and performance tests were developed 
which measured the important skills and knowledge in each course. 
To conserve time, multiple sets of equipment were used, routine 
operations were omitted, and key or difficult aspects were stressed. 
For objective scoring proctor's sheets were made specific. This re- 
quired agreement on the most acceptable procedures and often re- 
sulted in conferences at which procedures were improved. 

COMPARISON OF MILITARY BATTERIES 

Four batteries are compared in Illus. 117, two from the Navy and
two from the Army Air Force. These were chosen because they were
devised to give analytic results. Illustration 117 shows that in each of
these batteries, tests were included to measure nearly all of the
primary factors given in the column on the left. The time limit ranged
from 130 to 229 minutes of actual working time. The AAF Air Crew
Classification Test showed more specialization of contents than the
others. No figures have as yet come in to show the correlation
between the parts of different batteries which have been designed to
measure somewhat similar skills. It seems probable that certain of
these tests, for example, Vocabulary and Reading, will correlate
highly, whereas in tests of Spatial and Mechanical Knowledge there
may be large differences in content and emphasis, which will
probably yield low correlations. Similarly, in the Perceptual Speed Test
the type of objects and the methods of procedure will probably
produce different results. A comparison of this table with Illus. 85 shows
that these military batteries and the general-aptitude batteries
overlap to a large extent. In the future it will be possible to combine
these various batteries so as to predict success fairly accurately in a
large variety of civilian and military occupations.


MILITARY PERSONALITY INVENTORIES 

Ellis and Conrad (1948) reviewed seventy-six references on the 
application of personality inventories in military service. They
compared various inventories with two types of criteria: (a) psychiatric
classifications, both before and after induction, and (b) success, as
shown by either graduation from a course of training or ratings of
performance on the job. Approximately 40 per cent of the reports
concern officers or enlisted men in the United States Army Air Force, 
32 per cent Navy, 16 per cent Army, and 12 per cent selectees. A 
large variety of inventories was used, but most of them were short 
screening devices developed by the psychologists in the Armed Serv- 
ices. For example, approximately 40 per cent of the studies used 
the Personal Inventory (Shipley, 1946), and 29 per cent the Cornell 
Service Index (Weider, 1945), both of which consist of short lists of 
questions regarding one's own health, habits, worries, and adjustments.
The other studies included the Bell Adjustment Inventory, the
Minnesota Multiphasic Personality Inventory (MMPI), the
Humm-Wadsworth Temperament Scale, and the three Guilford-Martin
Inventories.

The inventories, even the short ones, proved to be fairly effective
as rough screening devices among selectees. For example, the Cornell
Service Index cut-off score could be set to detect from 71 to 89 per
cent of the men rejected as the result of neuropsychiatric interviews,
while indicating about 15 per cent of the "false positives," that is,
men judged to be unstable by the inventory, but not by the
interview. The cutting scores could be set to detect from 60 to 70 per cent
of those disenrolled for neuropsychiatric reasons, while including
from 2 to 13 per cent of the false positives. From 50 to 69 per cent
of the psychiatric discharges from the Navy were identified by the
Personal Inventory, including from 1 to 11 per cent of false positives.
The shorter scales did not attempt to identify the type of clinical
syndrome. The MMPI was reported by some authors, but not by
others, to have fairly close correlations with clinical diagnoses.
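
The two figures quoted for each cut-off score, the per cent detected
and the per cent of false positives, can be computed from a simple
fourfold table. The sketch below assumes that the false-positive
percentage is taken over the men passed by the interview, which the text
implies but does not state; the counts are invented for illustration.

```python
def screening_rates(flagged_rejects: int, missed_rejects: int,
                    flagged_accepts: int, cleared_accepts: int):
    """Return (detection rate, false-positive rate) for one cut-off score.

    "Rejects" and "accepts" refer to the interview verdict;
    "flagged" and "cleared"/"missed" refer to the inventory verdict.
    """
    detection = flagged_rejects / (flagged_rejects + missed_rejects)
    false_positive = flagged_accepts / (flagged_accepts + cleared_accepts)
    return detection, false_positive


# Hypothetical group of 1,000 selectees, 100 of them rejected by interview.
detection, false_positive = screening_rates(
    flagged_rejects=80, missed_rejects=20,
    flagged_accepts=135, cleared_accepts=765,
)
print(f"detected {detection:.0%} of rejects with {false_positive:.0%} false positives")
```

Moving the cut-off to flag more men raises both figures together, which
is why the text reports each detection range alongside a false-positive
range.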

Success in completing a course of training or in performance in
combat was not well predicted by any of the inventories. For
example, failures in schools for primary pilot training and for
advanced pilot training and in schools for submarine men, parachute
soldiers, marine-officer candidates, navigators, radar operators, and
bombardiers were seldom predicted with a significant correlation.
The highest correlations reported were .18 for a group of 1,039 officer
candidates and .39 for 1,070 parachute trainees, but most of the
correlations were much lower. Ratings by submarine officers of the
performance of their men showed correlations not significantly different
from zero with Shipley's Personal Inventory (Satter, 1945). Ratings
of 185 marine officers by superior officers on combat proficiency
showed a correlation of .15.

Why should these inventories detect maladjusted persons as
revealed by neuropsychiatric interviews before and after enlistment,
but fail to indicate success in training or in combat? Two reasons are
advanced by Ellis and Conrad (1948): first, the groups in training
and in combat were so highly selected that there were few, if any,
seriously maladjusted persons in these groups; second, the less
well-adjusted persons may have been strongly motivated to make a poor
showing on the inventory so that they would be discharged or
hospitalized. There is also evidence that occasionally the inventory
scores were available and used informally as part of the basis for
a psychiatric classification.

From such studies as these it seems safe to conclude that for
screening a large unselected adult group a short personal inventory will
detect about 75 per cent of the men who should be rejected from the
Armed Forces for neuropsychiatric reasons. It cannot be assumed
from these studies, however, that the short personal inventory will
be effective in civilian personnel work. With highly selected groups
or with those who are motivated to misrepresent themselves it would
probably not be of value. With other groups some items after a
detailed analysis might be found to have significance.

ASSESSMENT OF MEN 

One of the most extensive and intensive applications of role
playing to the study of personality factors was made by the Office of
Strategic Services (OSS), which was set up by the President and
Congress to meet the special conditions of World War II. The functions
of this office were (a) to collect and analyze information concerning
the activities of the enemy nations, and (b) to conduct operations
behind the enemy's lines by aiding and training the resistance groups
by means of the radio and pamphlets, and in other ways.
Approximately five thousand members of the Strategic Services organization
were appraised intensively over a 3-day period at one station, or
during one day at another. It was decided to assess each man
primarily on a cluster of dispositions, abilities, and traits which were
thought to be essential to the performance of almost every OSS job
overseas. A list of about twenty variables was finally reduced to the
following seven:

1. Motivation for assignment

2. Energy and initiative

3. Effective intelligence: ability to select strategic goals and the most
efficient means of obtaining them, resourcefulness, originality, good
judgment in dealing with people

4. Emotional stability: steadiness under pressure

5. Social relations: good team-play, freedom from disturbing prejudices
and annoying traits

6. Leadership: social initiative, ability to evoke cooperation, acceptance
of responsibility

7. Security: ability to keep secrets and to use discretion

In addition to these seven variables, three others were occasionally 
used as required. 

8. Physical ability: ruggedness, stamina, agility

9. Observing and recording: being able to evaluate information and to
record it very accurately

10. Propaganda skill: ability to see enemy vulnerability and to devise
subversive techniques of some sort or other

In order to appraise the various candidates on these traits, six 
different methods were used: 

1. Interviews, which were both formal and informal, over a period of
several days

2. Observations throughout a 3-day period

3. Individual-task situations, where a single candidate had to deal with
one or more persons in achieving his end

4. Group-task situations, where a team of candidates was instructed to
cooperate in performing a task

5. Projective tests which revealed some of the inhibited tendencies of the
candidates

6. Ratings by associates, in which the candidate's skills and his acceptance
by his coworkers were noted

Each of these methods was subdivided. For instance, in the
individual-task situations, each candidate had to direct two assistants
in helping him to erect a wooden structure. The assistants were
stooges who were instructed to be recalcitrant. Again, the candidate
had to interview a person applying for a position in a secret
organization, or two candidates had to deal with each other in face-to-face
situations which were somewhat prescribed. In order to make the
test situations as effective as possible, they were made similar to
those which would be met in actual war conditions. For example,
in appraising leadership a group was taken to a road where one
candidate in the presence of a whole group was told to take charge
of the situation: men had blown up a bridge a mile away, and he
must meet a truck a mile away in another direction with only 10
minutes to spend in getting across this road. The road had been
mined with a new type of sensitive mine which he would not be
able to neutralize or dig up. The road was assumed to lie between
two white lines, and the leader and his men were permitted to work
up and down the road as far as the white lines extended. The
destruction of the bridge had aroused the enemy, but they did not know
in which direction the men who blew it up had gone.

Since each situation furnishes evidence for more than one of the
traits under consideration, a 6-point rating scale was adopted for
each variable as follows: very poor, 7 per cent; inferior, 18 per cent;
low average, 25 per cent; high average, 25 per cent; superior, 18
per cent; very superior, 7 per cent. The percentages indicated the
proportion of men who would fall in each category if the variable
happened to be normally distributed in the population of candidates.
By combining one or more of the categories, this scale was
sometimes converted into a 2-point, 3-point, or 4-point scale.
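
If the variable is assumed to be normally distributed, the six
percentages fix the points on the standard-score scale that separate
adjacent categories. The sketch below derives those points; the OSS
staff are not reported in the text to have computed them this way, and
the names used are illustrative.

```python
from statistics import NormalDist

CATEGORIES = ["very poor", "inferior", "low average",
              "high average", "superior", "very superior"]
PER_CENT = [7, 18, 25, 25, 18, 7]


def cut_points(per_cent):
    """Standard scores separating adjacent categories under a normal curve."""
    nd = NormalDist()
    cuts, cumulative = [], 0
    for share in per_cent[:-1]:          # the last category has no upper cut
        cumulative += share
        cuts.append(nd.inv_cdf(cumulative / 100))
    return cuts


cuts = [round(z, 2) for z in cut_points(PER_CENT)]
for lower, z in zip(CATEGORIES, cuts):
    print(f"upper limit of '{lower}': z = {z:+.2f}")
```

On this assumption the two middle categories each span about two thirds
of a standard deviation on either side of the mean, and the extreme
categories begin about one and a half standard deviations out.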

A thorough statistical analysis of results shows the correlations
between the variables. Correlations between motivation for
assignment and other variables were as follows:

social relations              .4[?]
energy and initiative         .44
emotional stability           .43
leadership                    .36
propaganda skill              .35
security                      .23
effective intelligence        .22
physical ability              .22
observing and recording       .18




Likewise, the correlations of these assessments with appraisals of
actual success in the theater of war were found for small samples.
These correlations indicated roughly that effective intelligence was
the best single predictor of success. All the other variables showed
correlations not significantly different from zero. The samples
available for this validation study were, however, usually small, and the
methods of appraising workers' effectiveness were probably not as
well-standardized or as reliable as the methods used in the original
assessments. Effective intelligence was measured not only by test, but
also by behavior in a stress interview, in discussion and debate, and
by practical judgment and leadership. For practical purposes the
results of these assessments were always reported to the commanding
officers in simple and nontechnical terms.

CONTRIBUTIONS FROM MILITARY EXPERIENCES 
WITH TESTS 

Flanagan (1948) has summarized some of the most important con- 
clusions from the AAF programs under the following headings: 

Relative Importance of Aptitude and Training 

The comparisons of many groups in many training situations 
brought out great differences among instructors in the same school 
and even greater differences between schools. The program expanded 
so rapidly that civilian and military training officers were given a 
large amount of freedom in developing their training programs. By 
comparing the training in one situation with success in later situa- 
tions, it became apparent that success in both basic and advanced 
training was determined to a much greater extent by aptitudes, as
shown on the Air Crew Classification Tests, than by the instructor
or by the quality of his training. This was true of both pilots and
bombardiers. One should not generalize too freely from these results,
however; a great deal more research on the content and effectiveness
of specific training activities is needed.

Test Forms 

It was found that by using sufficient ingenuity all paper-and-pencil
methods could be set up in multiple-choice form suitable for machine
scoring. By using photographs of instrument faces, of model
airplanes, of terrain taken several thousand feet in the air, and the 
like, it was possible to obtain sample situations. Even some psycho- 
motor functions and reactions to motion pictures were successfully 
adapted to answer sheets. 

Apparatus tests were employed to a much greater degree than ever 
before by having timing and counting accomplished by means of 
electrical devices conveniently arranged on the examiner's control
desk. In this way one examiner could administer complicated
coordination tests to four or more persons at once.

Test Content

In order to measure unique, stable, and important traits, tests 
were designed to have one principal factorial loading on skills that 
had been learned or practiced over a long period of time and which 
showed considerable prediction of success of a particular kind.
Sufficient practice was given preliminary to the test proper to allow for
“warming up” and for the development of skill in the test situation. 
Both speed and power tests were found to be of value. 

Statistical Procedures 

Batteries of tests of independent traits were found to be more 
effective and more economical in predicting success than tests which 
had not been analyzed, and probably included a variety of unknown
factors in unknown amounts. In order to provide such tests, not
only the total scores but also each item of the tests had to be
evaluated for factorial purity and difficulty. Great economy in the
statistical procedure was accomplished by the use of parts of a group
rather than the whole, and by standard forms and machine methods.
It was found advisable to use a second sample in item analyses to
avoid various kinds of systematic and random variations. (Using a
second sample to confirm the first analysis is called cross validation.)

Job Requirements 

A great deal of emphasis was placed on defining job requirements
in terms of abilities, interests, and adjustments, and on indicating
their relative importance. It was found that both workers and
supervisors were inaccurate with regard to such analyses. Their judgments
were usually vague stereotypes, and they confused ability with
motivation. To analyze jobs, considerable training is needed in defining
independent traits and giving actual examples and in practice under
supervision. The value of a judgment depends to a large extent upon
the judge's knowledge of individual differences in the traits under
consideration.

Job analysis must determine evidence of the possession of critical
requirements, those that make the difference between success and
failure in important aspects of the job. These can best be determined
by studying the causes of good and poor performance. Valuable
evidence of this kind was given by individuals concerning their own
errors and concerning the effective and ineffective acts of their
supervisors, or those whom they supervised.

The best evidence of critical job requirements will come from a
follow-up study in which traits are carefully measured before
training and checked later against success. This kind of study is found to
be extremely difficult. It is dangerous to assume that because
successful and unsuccessful individuals differ with respect to a trait
now, they probably showed similar differences before they were
selected for training. Failure may affect the trait rather than be
caused by the trait.

Criteria of Success 

The important problem in all validation is the securing of good 
measures of success. To be effective these criteria must evaluate
independent skills separately, and for actual operations rather than
for training situations. The most relevant, reliable, and unbiased
criteria seemed to be objective measures of combat proficiency, but
these were often highly related to opportunity. Ratings based on
direct and systematic operations were fairly useful, but ratings based
on general impressions, reports, or incidents were the least valuable.
Reliability was considerably increased by focusing the rater's
attention on one well-defined trait while comparing all individuals in
a group. The forced-choice technique where raters were not able to 
determine the scoring procedure (Chapter XVI) was effective in cer- 
tain situations. 


STUDY GUIDE QUESTIONS 

1. In what ways did the objectives of testing programs in military
establishments differ from those of testing programs in educational systems?
In what ways were they alike?

2. How did the Army General Classification Test differ from the Navy
Basic Test Battery?

3. What percentiles of the military population are represented by scores
of from 110 to 129 on the Army General Classification Test?

4. What relations were found between AGCT scores and highest grade
completed in school, the ACE Psychological Examinations, and the Army
Alpha Test?

5. What relations were found between scores on the AGCT and success
in various types of military training?

6. What is the significance of the AGCT occupational norms?

7. How did the AAF Qualifying Examination and the Air Crew
Classification Test differ in purpose and composition?

8. How were stanine scores for the various air crew positions computed?
What predictive value did they have?

9. How well did the Navy Basic Test Battery predict success in navy
training and success on board ship?

10. Compare the content of military batteries with the content of
achievement batteries (Illus. 57) and aptitude batteries (Illus. 86).

11. How effective was the use of short personality inventories in military
selection?

12. How did the OSS appraise behavior, particularly behavior under
stress?

13. What evidence is there that batteries of tests of independent traits
were more economical and effective than unanalyzed tests?

CHAPTER XII


THE INTERPRETATION 
OF SCORES 




In the previous chapters the various ways of sampling a person's
behavior by recording his responses to particular types of test items
have been discussed. The total number of credited responses, called
the raw score, represents the performance of a person in a test
situation. The raw score, however, is of little use by itself. For
instance, no one is much the wiser by learning that Frank's score in a
vocabulary test was twenty words correctly defined. The raw score is
given significance only by comparing it with other scores. One of the
most important facts to know about a person is his position in a
standard group, for this shows how well he is equipped to meet
competition. This chapter describes several common ways of
interpreting raw scores in terms of one's relative position in a
group. Furthermore, it shows how groups can be measured and compared.

THE LIMITS OF A SCORE 

Since a score on a test is only an approximate indicator of the
results of an observation, it has variable limits. Within the limits
of the precision of the measuring instrument a test score is
considered to be a midpoint of a scale unit. For example, in measuring
the height of a person many times with a ruler, one is likely to make
errors and to record values above the true length as often as below
it. Hence, any obtained score such as 50 is usually thought of as
representing values from 49.5000 to 50.49999. The midpoint of this
unit is 50.0.

FREQUENCY OR DISTRIBUTION TABLES

The immediate result of testing a group of persons is simply a pile
of corrected test sheets, or a list of scores; see Illus. 118. In
order to compare these scores conveniently, they are arranged in a
frequency table, a table which shows the number of times each score
was made. The number of persons who have made a particular score
indicates the frequency of that score. Illustration 119 shows the
steps usually followed in making a frequency table. First, one finds
the range of scores, which is the difference between the highest and
the lowest scores in a group. If the range is large, it is not
advisable to count the number of times each score was made.
Unnecessary work is eliminated by tabulating scores in small groups,
called class intervals. Since it has been found that ten class
intervals allow one to calculate group norms about as accurately as a
larger number of class intervals, the range of scores is divided by 10
to find the size of the class interval. For instance, the range of
scores in Illus. 118 is 71 - 27 = 44, and this divided by 10 is 4.4.
To simplify tabulations the next highest whole number, 5, is taken as
the class interval to be used. The scores are then written as in the
first column of Illus. 119, showing the limits of each class interval.
Following a common usage for scales of distances or time, the lower
limit of each class interval is often used alone, as in the second
column of Illus. 119. These limits are usually selected so that they
are multiples of the interval chosen. The smallest scores are put at
the lower end of the column, and the highest at the upper end. The
number of persons whose scores are found in a class interval is
indicated by the tally marks in the third column, and by the
frequencies in the fourth.
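
The tabulation steps just described can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
function name frequency_table, the short list of scores, and the
restriction to whole-number scores are hypothetical and are not taken
from Illus. 119.

```python
import math
from collections import Counter


def frequency_table(scores, target_intervals=10):
    """Group whole-number scores into class intervals.

    Returns (width, rows); each row is
    (lower limit, upper limit, frequency, cumulative frequency),
    listed from the lowest interval to the highest.
    """
    low, high = min(scores), max(scores)
    score_range = high - low
    # Divide the range by ten and take the next highest whole number.
    width = max(1, math.ceil(score_range / target_intervals))
    # Lower limits are chosen as multiples of the interval.
    counts = Counter((s // width) * width for s in scores)
    rows, cumulative = [], 0
    lower = (low // width) * width
    while lower <= high:
        frequency = counts.get(lower, 0)
        cumulative += frequency
        rows.append((lower, lower + width - 1, frequency, cumulative))
        lower += width
    return width, rows


# Hypothetical scores spanning the same range (27 to 71) as Illus. 118.
sample = [27, 33, 35, 40, 44, 47, 47, 52, 58, 71]
width, rows = frequency_table(sample)
for lower, upper, frequency, cumulative in rows:
    print(f"{lower}-{upper}: f={frequency:2d} cum={cumulative:2d}")
```

With a range of 44 the sketch arrives at a class interval of 5 and
lower limits of 25, 30, and so on, as in the text.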

HISTOGRAMS AND FREQUENCY CURVES 

It is often easier to understand distribution tables when they are
pictured, and one of the best ways of picturing them is by drawing
a histogram. A histogram is drawn on graph paper with the base line
divided into class intervals, and a standard area above this line is
allotted to each person. Thus, in Illus. 120 each person is
represented by one small rectangle. The whole figure represents the
data shown in Illus. 119. A frequency curve is a histogram which has
been smoothed according to some method, usually by connecting the
midpoints of adjoining columns, as in Illus. 120. A frequency curve is
supposed to be a slightly truer picture of a distribution than a
histogram, since the cases in a class interval are seldom actually
distributed evenly over the whole interval, as they are in a
histogram. On close inspection they are found to be distributed
unevenly, with more cases falling at that end of the interval which is
nearer the center of the whole group. When a frequency curve has a
large number of class intervals, and represents a large number of
cases, the surface will be smooth indeed. It can then be considered a
continuous curved line, as in Illus. 121.


ILLUS. 118. RAW SCORES MADE BY 100 STUDENTS ON A NUMBER
COMPARISON TEST

35  40  50  43  60  53  45  50  71  43
45  60  36  60  55  47  42  45  55  52
46  44  51  62  44  52  33  44  63  37
39  62  66  39  69  58  43  44  38  44
56  39  47  41  40  36  40  48  55  58
50  47  45  42  50  30  39  33  51  36
57  60  32  69  53  50  65  53  48  27
40  47  35  58  63  53  64  35  54  49
45  62  45  33  43  61  64  46  42  49
53  57  54  59  48  30  50  49  29  37

Number of Scores (N) = 100
Sum of Scores (Σ)    = 4796
Mean (M)             = 47.96


ILLUS. 119. FREQUENCY TABLE OF SCORES IN ILLUS. 118

(1)        (2)      (3)                     (4)         (5)
Class      Lower    Tally                   Frequency   Cumulative
Interval   Limit                                        Frequency

70-74      70       |                           1         100
65-69      65       ||                          2          99
60-64      60       ||||| ||||| ||             12          97
55-59      55       ||||| ||||| ||             12          85
50-54      50       ||||| ||||| ||||| ||       17          73
45-49      45       ||||| ||||| ||||| |||      18          56
40-44      40       ||||| ||||| ||||| ||       17          38
35-39      35       ||||| ||||| |||            13          21
30-34      30       ||||| |                     6           8
25-29      25       ||                          2           2

N is the sum of column (4) = 100

Calculation of Quartiles and Centiles

$P_{75}$ = 75th centile = 3rd quartile = 54.5 + (2/12)5  = 55.33
$P_{50}$ = 50th centile = median       = 44.5 + (12/18)5 = 47.83
$P_{25}$ = 25th centile = 1st quartile = 39.5 + (4/17)5  = 40.67
$P_{90}$ = 90th centile                = 59.5 + (5/12)5  = 61.58
$P_{10}$ = 10th centile                = 34.5 + (2/13)5  = 35.27

$Q = \frac{P_{75} - P_{25}}{2} = \frac{55.33 - 40.67}{2} = 7.33$

$Sk = \text{Skewness} = \frac{P_{90} + P_{10}}{2} - P_{50} = \frac{96.85}{2} - 47.83 = .60$


ILLUS. 120. HISTOGRAM AND FREQUENCY CURVE
OF SCORES IN ILLUS. 118

[Figure: number of persons plotted against score.]

Note: All numbers refer to the lower limit
of their class intervals.


NORMAL FREQUENCY CURVE 

Frequency curves have been compiled for literally thousands of
cases in hundreds of biological, physical, and psychological studies,
and a striking similarity among these curves is the rule. A large
number of them closely resemble what is known as the normal curve of
distribution or the Gaussian curve. The normal curve (Illus. 121) has
been found in many instances to be typical of large samples of

1. Mechanical variations, such as the distribution of heads and
tails shown after tossing coins, combinations of playing cards,
and numbers in wheels of chance

2. Mathematical combinations, such as polynomial expansions

3. Physiological phenomena: height, weight, pulse rate, and
ossification

4. Social phenomena: number of books in private homes and
attitudes toward various social and political institutions

5. Mental abilities: information and skills in school subjects,
intelligence tests, occupational skills, and errors in observation

ILLUS. 121. NORMAL CURVE

[Figure: the normal curve, with standard deviations laid off along
the base line.]

This curve always has the same relation between X and Y values. Thus,
if the height at the mean is 100, then the height at + or - one SD is
60. The proportion of the area which lies between the curve, the base
line, and any two perpendiculars can be calculated with great
accuracy.


The shape of this curve, which has been studied closely, is worth
noting, for its characteristics have led to the development of an
important scaling technique. First, observe that the greatest
frequency of scores is at the midpoint. From the midpoint the sides
slope down, slowly at first, then more rapidly until a point in the
curve, called the inflection point, is reached, whence the curve
flattens out gradually. The curve continues until it becomes almost
parallel with the base line. Theoretically, the curve would never
touch the base line if the cases were infinite in number, but it would
continue to approach the base indefinitely. Actually, there are
always highest and lowest scores in the measurement of individuals in
a real group. The normal curve, whose dimensions are constant and
well-known, has become a standard with which group measures may
be compared.
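
The constant relation between distance from the mean and height of the
normal curve can be checked with a short Python sketch. The function
name relative_height is hypothetical; the sketch simply evaluates the
standard normal ordinate, scaled so that the height at the mean is
100.

```python
import math


def relative_height(z, height_at_mean=100.0):
    """Height of the normal curve z SD from the mean, scaled to the mean."""
    return height_at_mean * math.exp(-0.5 * z * z)


for z in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(f"{z:.1f} SD from the mean: height {relative_height(z):6.2f}")
```

The height at one SD comes out a little above 60, and the curve keeps
falling toward the base line without reaching it.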

OGIVE CURVE 

Another useful picturing of a distribution of scores shows the
number of persons whose scores fall above or below a score (Illus.
122). This is known as a cumulative frequency curve, or ogive curve.
The cumulative frequency, shown in the fifth column of Illus. 119, is
secured by adding the frequency of each class interval to the sum of
the frequencies of all the intervals below it. An ogive curve is made
by laying off the scores on the base line, or abscissa, and the total
number or per cent of cases on the ordinate, or vertical line. The
number of cases in the lowest class interval is indicated by a dot
over the score which is the upper limit of the lowest class interval.
The upper limit is used to indicate the number of persons who fell
below this score. The number of persons in both the first and second
class intervals are added together and indicated by a point over the
upper limit of the second class interval. Similar procedures are used
for the other intervals. Each point on the curve then represents the
number of persons whose scores fall below a particular score. Of
course the number falling above any score can also be quickly found
from this chart.

ILLUS. 122. OGIVE OR CUMULATIVE FREQUENCY CURVE
OF THE SCORES IN ILLUS. 121

[Figure: per cent and number of cases plotted against score on the
number comparison test, from 25 to 75.]
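
The points of an ogive can be computed as a running sum. In the
Python sketch below the class frequencies are an assumption: they are
the ones implied by the figures quoted in this chapter (for example,
73 cases below 55 and 12 cases in the interval 55-59), and the exact
upper limit of each interval is taken as half a point above its
highest score.

```python
from itertools import accumulate

# Assumed class intervals (lower score limits) and frequencies.
LOWER_LIMITS = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
FREQUENCIES = [2, 6, 13, 17, 18, 17, 12, 12, 2, 1]
WIDTH = 5


def ogive_points(lower_limits, frequencies, width):
    """Pair each exact upper limit with the number of cases below it."""
    upper_limits = [lower - 0.5 + width for lower in lower_limits]
    cumulative = list(accumulate(frequencies))
    return list(zip(upper_limits, cumulative))


points = ogive_points(LOWER_LIMITS, FREQUENCIES, WIDTH)
total = points[-1][1]
for upper, below in points:
    print(f"below {upper}: {below:3d} cases, above: {total - below:3d}")
```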


RANKS, CENTILES, QUARTILES, DECILES,

AND LETTER GRADES

One of the simplest ways of describing a person's position in a
group is to indicate his rank when the members of the group are
arranged in order of size of scores. Ranks are valuable for indicating
how many persons received scores that are above that of a given
person. This is all that one needs to know if he is considering test
results in situations such as assigning a definite number of
scholarships or filling a definite number of positions. When knowledge
of relative excellence is desirable, however, as for a wider
distribution of grades or other awards, it can be secured by the use
of centiles. The centile shows the proportion of the group which falls
below a given score. Thus, a centile of 75 means that one did better
than 75 per cent of the group, but not as well as the highest 25 per
cent. Centiles may be read directly from an ogive curve, such as that
shown in Illus. 122, or they may be calculated, as in Illus. 119.
Centiles, also called percentiles, have as their symbol P, with a
subscript to indicate the particular centile. To calculate the 75th
centile score ($P_{75}$) the score of the 75th person¹ from the bottom
must be found, for there are just one hundred persons in this group.²

The cumulative frequencies in the fifth column of Illus. 119 show
that 73 scores fell below the class interval 55 to 59.99. Hence, two
cases in this interval are needed to reach the 75th in the group. It
is customary to calculate this score in all large groups by
interpolation, which is the process of estimating a value between two
given values by the use of a ratio. In this case, the given values are
the limits of the class interval, and the ratio is 2/12, since two of
the twelve cases in this class interval are needed. Therefore $P_{75}$
is found by adding 2/12 of the size of the class interval, 5, to the
lower limit of the class interval.

$P_{75} = 54.5 + (2/12)5 = 55.33$
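
The interpolation can be written as a small Python function. This is
a sketch under stated assumptions: the name centile is hypothetical,
and the class frequencies are those implied by the figures quoted in
this chapter rather than a table copied from the book.

```python
# Assumed (lower score limit, frequency) pairs, lowest interval first.
INTERVALS = [(25, 2), (30, 6), (35, 13), (40, 17), (45, 18),
             (50, 17), (55, 12), (60, 12), (65, 2), (70, 1)]
WIDTH = 5


def centile(p, intervals=INTERVALS, width=WIDTH):
    """Score below which p per cent of the cases fall, by interpolation."""
    n = sum(frequency for _, frequency in intervals)
    target = p * n / 100      # rank of the case wanted, from the bottom
    below = 0                 # cases below the current class interval
    for lower, frequency in intervals:
        if frequency and below + frequency >= target:
            exact_lower = lower - 0.5   # the interval 55-59 begins at 54.5
            return exact_lower + (target - below) / frequency * width
        below += frequency
    raise ValueError("p must not exceed 100")


for p in (10, 25, 50, 75, 90):
    print(f"P{p} = {centile(p):.2f}")
```

For the 75th centile the function adds 2/12 of the interval to 54.5,
which is the calculation shown in the text.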

¹ This calculation would be more precise if the score were found which
lies halfway between the 75th and 76th person. For usual work this
refinement is not considered necessary.

² If the number were 160, the $P_{75}$ would be the score of the 120th
person (160 × .75 = 120).

Centiles are useful indicators of a person's position in his group,
since nearly everyone with as much as a sixth-grade education can
understand them. They are probably the scores used most frequently for
adults. Illustration 123 allows raw scores to be changed rapidly to
centiles by reading the centile value from the top of the column. For
instance, if a boy in the sixth grade received a score of 34, his
centile rank would be 93, and it further appears that his score is
nearly as high as that of the average ninth grade student. This table
was prepared by securing scores for fairly large groups in each grade
and then finding the scores which corresponded to each of the
centiles. Such tables are now available for a large number of standard
tests.

ILLUS. 123. CONVERSION TABLE FOR MICHIGAN SPEED OF
READING TEST

Forms 1 and 2, Items Correct, Detroit Sample
1937 Revision (7 min.), N = 3302

                               T Score
Grade in    Mean     30  35  40  45  50  55  60  65  70      %
School      Age                Centile Rank                Correct
                      2   7  16  30  50  70  84  93  98

                               Raw Score
Senior      21-0     41  45  49  53  57  61  65  69  73      98
Junior      20-1     38  42  46  50  54  58  62  66  70      97
Sophomore   19-1     36  40  44  48  52  58  60  64  68      95
Freshman    18-1     31  35  40  44  49  54  58  62  66      94
12.0        17-2     30  34  38  42  46  50  54  58  62      96
11.0        16-3     27  31  35  39  43  47  51  55  59      96
10.0        15-3     23  27  31  35  40  45  49  54  58      96
9.0         14-5     18  22  27  31  36  41  45  49  54      95
8.0         13-6     16  20  24  28  32  36  40  44  48      94
7.0         12-7     14  17  21  24  28  32  35  39  42      91
6.0         11-7      9  12  16  19  23  27  30  34  37      87
5.0         10-6      3   6  10  13  17  21  24  28  31      80
4.0         9-5       0   2   5   8  11  14  17  20  23      72
3.0         8-4       0   0   0   0   3   6   9  12  15      40

Letter Grades:  E   D   C   B   A

Note: All numbers refer to lower limits of class intervals.

(Greene, 1937. By permission of The Psychological Corporation.)
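
Reading a centile rank from such a conversion table amounts to finding
the last column whose lower limit does not exceed the raw score. The
Python sketch below uses the sixth-grade row of Illus. 123; the
function name centile_rank and the choice to return None for scores
below the first column are assumptions made for illustration.

```python
from bisect import bisect_right

CENTILE_RANKS = [2, 7, 16, 30, 50, 70, 84, 93, 98]
# Sixth-grade raw scores; each is the lower limit of a class interval.
SIXTH_GRADE = [9, 12, 16, 19, 23, 27, 30, 34, 37]


def centile_rank(raw_score, lower_limits, centile_ranks=CENTILE_RANKS):
    """Centile rank of the column whose class interval holds raw_score."""
    position = bisect_right(lower_limits, raw_score) - 1
    if position < 0:
        return None  # score falls below the lowest tabled column
    return centile_ranks[position]


print(centile_rank(34, SIXTH_GRADE))
```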

Another fairly common way of indicating a person's position in a
group is to say in which quarter of the group his score falls. The
dividing lines between quarters are called quartiles, and these are
the 25th, 50th, and 75th centiles.

In some classifications letter systems are used for grades. In these
a definite proportion of a group is sometimes arbitrarily assigned a
particular letter. Illustration 124 shows three letter gradings that
are in use. From this it appears that in the United States Army
examinations a person was given a grade of A if he fell in the highest
4 per cent of the examinees. On the Strong Interest Blank a rating of
A is assigned to a person when his score falls among the highest 75
per cent of scores of an occupational group. On a distribution of
grades from an elementary psychology class, a rating of A was assigned
to the highest 17 per cent of the group. The usefulness of letter
ratings is limited, therefore, by the variation in their meanings.

ILLUS. 124. THE ASSIGNMENT OF LETTER RATINGS TO PROPORTIONS
OF A GROUP

            1          2         3

A          4.09%      75%       17%
B          8.82       25        25
C+        16.69        0
C         26.78        0        33
C-        21.80        0
D         11.38        0        18
E          7.38        0         7

1. United States Army, June 1918. Memoir, National Academy of
Sciences, 1921, XV, p. 421.

2. Strong's Interest Blank (1911).

3. A distribution of grades from an elementary psychology class.

DIMENSIONS OF A GROUP OF SCORES 

Frequency tables and their corresponding curves have three aspects
or dimensions: central tendency, dispersion of scores, and shape.
A central tendency is a single score near the center of a group which
may be used to represent the standing of the whole group. Dispersions
show the range of scores found in various portions of a group.
The shape of a curve indicates its symmetry, or skewness, and its
irregularities. These dimensions can be described by numerical
indicators which are useful for comparing groups, and persons within
groups.

Central Tendencies

The three common measures of the central tendency of a group
are: the arithmetic mean, the mode, and the median, all of which
are called averages.

Mean. In this book the mean refers to the arithmetic mean unless
otherwise indicated. It represents a point of balance which
would be found if all scores in a group were assigned the same weight
and then arranged along a horizontal beam according to class
intervals. The histogram in Illus. 120 is a picture of such an
arrangement. If this histogram were a pile of bricks on a straight
beam,³ then it would be possible to find a point of balance at which
the bricks on one side of the point would equal in weight the bricks
on the other side. This concept of balance underlies many treatments
of psychological material, and implies two basic assumptions: (1) that
the psychological scale is comparable to a linear scale, such as
distance in a straight line; (2) that members of a group can be
considered to be a number of equal, unrelated weights, that is, their
measured qualities are unaffected by their position. Both of these
assumptions have been challenged with regard to special situations,
but they seem to be appropriate in many test situations, and the mean
is probably the most commonly used central tendency.

One way to find a mean is to add the scores of all the persons in
the group and to divide this total by the number of persons in the
group. By using this method the mean for the scores in Illus. 118 is
found to be 47.96. If a large number of persons are to be measured,
an adding machine will save a great deal of time.

A short-cut in finding a mean is first to guess at the probable mean
score, and then make the necessary corrections. In Illus. 125 the
guessed mean is arbitrarily taken as the midpoint of the class
interval, 45-49.9, which seems from inspection to be nearest the
middle of the distribution. The deviations, or number of steps, of all
class intervals from this class interval are placed in the third
column, and the frequency of each class interval is multiplied by its
respective deviation, as in the fourth column. The deviations above
the guessed mean are marked plus, and those below, minus. The total
minus deviations are subtracted from the plus deviations
(90 - 69 = +21) to find the amount and direction of the correction for
guessing. The result, +21 deviations, divided by the total number of
cases, is the mean correction. This amount is multiplied by the size
of the class interval to make its units of the same denomination as
those of the raw score. Finally, this correction is added to the
guessed mean, 47. The result is 48.05, which is taken as the mean of
the distribution. Illustration 125 also gives a general equation with
letters substituted for the words and phrases used here. Such
equations are a great convenience, for the letters or algebraic
symbols are more quickly written and read than the words they
represent. Since these symbols are widely used, it is advisable to
learn them.

³ To make a better illustration, the beam itself should have no weight
at all.
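
The guessed-mean short-cut can be sketched in Python as follows. The
function name is hypothetical, and the class frequencies are an
assumption: they are the ones implied by the sums quoted in the text
(plus deviations of 90, minus deviations of 69, and 100 cases).

```python
# Assumed class-interval midpoints and frequencies, lowest first.
MIDPOINTS = [27, 32, 37, 42, 47, 52, 57, 62, 67, 72]
FREQUENCIES = [2, 6, 13, 17, 18, 17, 12, 12, 2, 1]
WIDTH = 5


def mean_by_guessed_mean(midpoints, frequencies, guessed_mean, width):
    """Mean = guessed mean + (sum of f*d / N) * class interval."""
    n = sum(frequencies)
    # Deviation of each class interval from the guessed mean, in steps.
    steps = [round((m - guessed_mean) / width) for m in midpoints]
    sum_fd = sum(f * d for f, d in zip(frequencies, steps))
    mean_correction = sum_fd / n
    return guessed_mean + mean_correction * width


mean = mean_by_guessed_mean(MIDPOINTS, FREQUENCIES, guessed_mean=47,
                            width=WIDTH)
print(f"mean = {mean:.2f}")
```

The result differs slightly from the mean of the ungrouped scores
because every score in a class interval is treated as if it lay at
the midpoint.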


ILLUS. 125. CALCULATION OF MEAN AND STANDARD DEVIATION
OF SCORES IN ILLUS. 118

[Table: for each class interval, the frequency, the deviation from
the guessed mean, the frequency times the deviation, and the squared
deviations.]

Mode. The mode of a group is defined as that score or class interval
which has the largest frequency. It is found by making a frequency
distribution and inspecting it. Illustrations 120 and 125 show
one mode which is at 47, the midpoint of the class interval with the
largest frequency, 18 persons.

In some cases a frequency curve shows two or more modes. Two
class intervals which have large frequencies may be separated by
class intervals having smaller frequencies, as in Illus. 126.
Sometimes these bimodal curves result from combining data from what
would otherwise form two separate unimodal curves, as is apparently
the case in Illus. 126. Occasionally bimodal curves are the result of
poor measurement techniques. As bimodal curves are rather rare in the
measurement of human skills under normal conditions, they need
to be scrutinized carefully, and the data on which they are based
should be analyzed to find the cause of abnormality.

ILLUS. 126. BIMODAL CURVE, CRITICAL SCORE

[Figure: number of workers plotted against seconds needed for
number-checking test.]

(From Link, 1918. Reproduced by permission of The Macmillan Co.)


Median. The median is defined as the middle score ($P_{50}$) in a
group when the scores are arranged according to size. The median is
found by an interpolation of the scores in the class interval which
contains the middle score. If we take the middle score as that of the
fiftieth person in a group of one hundred, or the 50th centile, then
in the fifth column of Illus. 119, it appears that this centile falls
in the class interval 45-49.9. Thirty-eight persons are below this
class interval; therefore the twelfth person from the bottom of this
class interval is the fiftieth person in the group. His approximate
score can be calculated by interpolation when it is assumed that the
eighteen scores in this class interval are evenly distributed over the
whole interval. When the interval is small compared with the number of
cases falling in this interval, this assumption does not introduce an
appreciable error. From this assumption 12/18 of the size of the class
interval (5) must be added to its lower limit (44.5) to secure the
score of the fiftieth person. In this instance the median is 47.8.

$\text{Median} = 44.5 + (12/18)5 = 47.8$

The Uses of Central Tendencies. The question often arises:
which central tendency is the best? The answer depends upon the
distribution of the scores and their intended use. In a normal curve
the three central tendencies are identical, but in an irregular or a
skewed curve they will differ (Illus. 127). A curve is said to be
skewed from the normal when it is not symmetrical.

Let us consider a situation in which the most representative score
of a group is desirable. Of the three the mean is most affected by the
extreme scores of a group, so that when the extreme scores are
thought to be rather insignificant, the median or mode is used. The
mean represents a true center of balance in a group, and it is
generally preferred when the scores represent precise measurements.
The mode indicates the most frequent score in a group, and it also
shows when bimodality occurs.
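
The statement that the mean is the central tendency most affected by
extreme scores can be illustrated with a short Python sketch. The two
small sets of scores are hypothetical and were chosen only to make
the arithmetic easy to follow.

```python
from statistics import mean, median, mode

# Hypothetical scores; the second set differs only in its highest score.
scores = [40, 45, 45, 50, 55]
with_extreme = [40, 45, 45, 50, 155]

for label, data in (("original", scores), ("one extreme", with_extreme)):
    print(f"{label:12s} mean={mean(data):5.1f} "
          f"median={median(data)} mode={mode(data)}")
```

A single extreme score moves the mean by twenty points while the
median and mode remain where they were.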


ILLUS. 127. SKEWNESS

[Figure: curve A, positively skewed; curve B, negatively skewed.]



In addition to using a central tendency as the most representative
score of a group, it is also used in calculating various interesting
ratios which are described later in this book.

Dispersions 

The amount of dispersion exhibited by a group is usually revealed
by one of four indicators: the total range, the quartile deviation,
the probable error, or the standard deviation.*

Total Range. The total range of scores is the difference between
the highest and lowest scores in a group. Its use in securing class
intervals has already been described. It is not, however, used
commonly for comparing the dispersions of two groups of persons,
because its size depends upon the two extreme scores. These two scores
often vary considerably in several chance samplings, particularly in
small groups, so that other indications of dispersion which are less
affected by chance are more useful.

* Another indication of dispersion which is also used occasionally is
called the average deviation, AD, or mean deviation, MD, or mean
variation, MV. It is found by summing all deviations from the mean
regardless of sign, and dividing this sum by the number of scores. In
a normal curve the AD is one half the range of the middle 57.5 per
cent of cases.

Quartile Deviation and Probable Error. Both the quartile deviation
(Q) and the probable error (PE) are found by taking one half
the range of the middle 50 per cent of the cases in a group. The Q
is used for actual measures, and the PE is applied to hypothetical
situations where the probable variation which would occur through
chance is desired for purposes of prediction. Q is found by
subtracting the 25th centile from the 75th centile, and dividing the
remainder by 2.

$Q = \frac{P_{75} - P_{25}}{2}$

The calculation of Q is shown in Illus. 119.

The probable error (PE) is usually calculated by multiplying the
standard deviation by .6745. This figure is the ratio of the PE to the
SD of all normal curves. Both the standard deviation and PE are
used only for normal distribution curves.
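
Both indicators are simple to compute. In the Python sketch below the
function names are hypothetical, the centile values are those given
for Illus. 119, and the ratio .6745 is checked against the standard
library's normal distribution rather than taken on trust.

```python
from statistics import NormalDist


def quartile_deviation(p75, p25):
    """Q: one half the range of the middle 50 per cent of cases."""
    return (p75 - p25) / 2


def probable_error(sd):
    """PE of a normal distribution with the given standard deviation."""
    return 0.6745 * sd


# In a normal curve the 75th centile lies this many SD above the mean.
pe_to_sd_ratio = NormalDist().inv_cdf(0.75)

print(f"Q  = {quartile_deviation(55.33, 40.67):.2f}")
print(f"PE for an SD of 10 = {probable_error(10):.3f}")
print(f"ratio of PE to SD  = {pe_to_sd_ratio:.4f}")
```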

Standard Deviation. If one should drop perpendiculars from the
two inflection points of a normal curve of distribution to the base
line, the total area thus bounded would include 68.27 per cent, or
roughly two thirds of the whole area. The distance on the base line
from the mean to the foot of such a perpendicular is known as a
standard deviation. Its usual symbols are a small sigma (σ) or the
initials SD. The SD, used as a unit of measurement, can be laid out
along the base line a number of times, as in Illus. 121. The normal
curve extends approximately 2½ SD in each direction from the
mean, and the total length of the base line needed to include 98.76
per cent of the cases is 5 SD. The percentage of the area which lies
above each hundredth of an SD has been calculated to twelve decimal
places. A standard deviation has therefore become a precise
indicator of position in a group. The SD has another advantage: it
produces a scale whose steps are equal in the sense that they represent
equally often-noticed differences. Theoretically, the difference
between scores of persons who are .5 and .6 SD above the mean can be
noticed by a competent group of judges just as often as a difference
of .1 can be noticed anywhere in the scale. This provides a method
of scaling scores of psychological phenomena, which is comparable
with the best physical scales, since all scales are based on the ability
of competent judges to notice certain differences equally often.
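
The areas quoted for the normal curve can be verified with Python's
standard library. The function name area_within is hypothetical; it
returns the proportion of a normal distribution lying within a given
number of standard deviations on either side of the mean.

```python
from statistics import NormalDist


def area_within(k):
    """Proportion of a normal curve lying within k SD of the mean."""
    unit_normal = NormalDist()
    return unit_normal.cdf(k) - unit_normal.cdf(-k)


for k in (1, 2, 2.5, 3):
    print(f"within {k} SD of the mean: {100 * area_within(k):.2f} per cent")
```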

The Shape of a Curve. Before calculating the standard deviation
for any group, one must be reasonably sure that his figures fit a
normal distribution curve fairly well. Tests for goodness of fit
include those for skewness (Sk) and kurtosis (Ku). A curve is said to
have skewness when it is not symmetrical. Kurtosis is the relative
flatness of a curve. A curve varies from normal kurtosis when it is
relatively more peaked, leptokurtic, or more flattened, platykurtic,
than the normal curve.

Skewness can be roughly measured by the formula

$Sk = \frac{P_{90} + P_{10}}{2} - P_{50}$ (Garrett, 1937, p. 299).

When Sk is zero the curve is symmetrical. When Sk is positive the
curve extends farther to the right than to the left of the mean (shown
as A in Illus. 127). When Sk is negative, the curve will resemble B in
Illus. 127. From Illus. 119 the skewness of the distribution is found
to be .60.

The kurtosis of a curve can be roughly found by the formula

$Ku = \frac{Q}{P_{90} - P_{10}}$ (Garrett, 1937, p. 230).

This formula gives a value of .263 for the normal curve. The
distribution in Illus. 119 shows a kurtosis of .275. The amount by
which a curve may deviate from the normal shape in either skewness or
kurtosis by chance errors of measurement can be known from formulas in
Garrett or other statistical texts. Small variations from the normal
curve introduce insignificant errors in practical comparisons.
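
The two centile formulas can be tried out in Python. The function
names are hypothetical; the centile values are those given for Illus.
119, and the normal-curve figure is obtained from the standard
library as a check on the value .263.

```python
from statistics import NormalDist


def centile_skewness(p90, p10, p50):
    """Sk = (P90 + P10) / 2 - P50."""
    return (p90 + p10) / 2 - p50


def centile_kurtosis(q, p90, p10):
    """Ku = Q / (P90 - P10)."""
    return q / (p90 - p10)


sk = centile_skewness(61.58, 35.27, 47.83)
ku = centile_kurtosis(7.33, 61.58, 35.27)

# The same kurtosis ratio for a normal curve, in SD units.
unit_normal = NormalDist()
ku_normal = unit_normal.inv_cdf(0.75) / (
    unit_normal.inv_cdf(0.90) - unit_normal.inv_cdf(0.10))

print(f"Sk = {sk:.3f}, Ku = {ku:.3f}, normal Ku = {ku_normal:.3f}")
```

With these rounded centiles the skewness comes to about .6 and the
kurtosis to roughly .28, close to the normal value.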

Calculation of a Standard Deviation. Since the curve in Illus. 120
has shown nearly normal values for both skewness and kurtosis, the
formulas which have been designed for use with a normal curve may
be safely applied. The calculation of the standard deviation for the
scores in Illus. 120 is shown in Illus. 125. It is found by the
following steps:

1. The deviations from the guessed mean are squared and added
in the fifth column.

2. The sum of the squared deviations is divided by the number of
cases. This result is the mean of the squared deviations.

3. The mean of the squared deviations is corrected for the error
caused by the use of the guessed mean by subtracting the square of
the mean correction, that is, the mean of the deviations from the
guessed mean. If the deviations had been measured from the
true mean, this step would not have been necessary, but it is nearly
always quicker to use a guessed mean and the correction than to find
the correct mean and then measure deviations from it.

4. The square root of the corrected mean of the squared deviations
is found and multiplied by the class interval. This is the SD.
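
The four steps can be followed in a short Python sketch. The function
name is hypothetical, and the class frequencies are an assumption
(those implied by the figures quoted earlier in this chapter), so the
result should be read as an illustration of the method rather than as
the figure printed in Illus. 125. A direct calculation from the class
midpoints is included as a cross-check.

```python
import math
from statistics import pstdev

# Assumed deviations (in class-interval steps from the guessed mean of
# 47) and frequencies, lowest class interval first.
STEPS = [-4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
FREQUENCIES = [2, 6, 13, 17, 18, 17, 12, 12, 2, 1]
WIDTH = 5


def sd_by_guessed_mean(steps, frequencies, width):
    n = sum(frequencies)
    # Steps 1 and 2: mean of the squared deviations.
    mean_square = sum(f * d * d for f, d in zip(frequencies, steps)) / n
    # Step 3: subtract the square of the mean correction.
    mean_correction = sum(f * d for f, d in zip(frequencies, steps)) / n
    corrected = mean_square - mean_correction ** 2
    # Step 4: square root, multiplied by the class interval.
    return math.sqrt(corrected) * width


sd = sd_by_guessed_mean(STEPS, FREQUENCIES, WIDTH)

# Cross-check: place every case at its class midpoint and compute the
# standard deviation directly from the true mean.
midpoint_scores = [47 + WIDTH * d
                   for d, f in zip(STEPS, FREQUENCIES) for _ in range(f)]
direct_sd = pstdev(midpoint_scores)

print(f"short-cut SD = {sd:.2f}, direct SD = {direct_sd:.2f}")
```

The two results agree, which shows that the correction in step 3
fully removes the error introduced by the guessed mean.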
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ILLUS 128. STANDARD ERROR O'F THE MEAN 



LEVEL OF SIGNIFICANCE 


The discussion in the preceding sections indicates that these
measures of a group of scores are likely to vary somewhat if they are
repeated. Sometimes it is important to know how much they are likely
to vary for a representative group. This can be determined in two
ways: one is by actual repetitions of measures of the same group,
and the other by mathematical estimates. Making actual trials is, of
course, the best method, but, since it is extremely expensive and time
consuming, the statistical method is generally used. It is assumed
that the more persons in the group, the more accurate or
representative will be the results. Also, it is assumed that if the
group were actually measured many times, the means would fall into a
normal distribution, whose standard deviation is called the standard
error of the mean ($\sigma_m$). The formula for this is

$\sigma_m = \frac{\sigma}{\sqrt{N - 1}}$

Thus if a group of 570 pilot candidates have a mean of 142 and a
$\sigma$ of 10, then

$\sigma_m = \frac{10}{\sqrt{569}} = .42$ (See Illus. 128.)

We can say how much the mean is likely to vary in this group
because we know the proportions of the normal curve which
correspond to standard deviations. These are given in Illus. 121, and
in greater detail in Illus. 129. For instance, one standard deviation
below the mean corresponds to a centile of 15.87. The chances are
therefore about 16 in 100 that the mean pilot score would on many
repetitions fall below 141.58 (142 - .42). Since two standard
deviations below the mean correspond to a centile of 2.28, the
probability is only a little more than 2 in 100 that the mean will
fall below 141.16, which is two standard deviations below the mean
actually found (142 - .84).
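
The pilot-candidate example can be reproduced in Python. The function
names are hypothetical; the standard error follows the formula given
in the text (the standard deviation divided by the square root of
N - 1), and the chances are read from the standard library's normal
distribution in place of a printed table.

```python
import math
from statistics import NormalDist


def standard_error_of_mean(sd, n):
    """Standard error of the mean by the formula sd / sqrt(N - 1)."""
    return sd / math.sqrt(n - 1)


def chance_below(k):
    """Per cent of a normal curve lying more than k SD below the mean."""
    return 100 * NormalDist().cdf(-k)


mean, sd, n = 142, 10, 570   # the pilot-candidate figures from the text
se = round(standard_error_of_mean(sd, n), 2)
for k in (1, 2):
    limit = mean - k * se
    print(f"below {limit:.2f}: about {chance_below(k):.2f} chances in 100")
```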

The actual variation in score (.84) is called a fiduciary limit and
corresponds to plus or minus two standard deviations. Fiduciary
limits can be found for any desired level of significance. It can be
said with great certainty that the mean will not fall by chance more
than three standard errors below or above what was actually found.
The reliabilities of measures of dispersion are likewise calculated
by assuming that the chance forces which cause them to vary are
related to the size of the group. Thus the standard error of a
$\sigma$ is

$$\sigma_\sigma = \frac{\sigma}{\sqrt{2N}}$$

ILLUS. 129 CORRESPONDENCE OF STANDARD DEVIATIONS, T SCORES, CENTILES, AND ORDINATES OF THE NORMAL CURVE

1 Proportion of total area below the point indicated by the standard deviation.

2 Relative height of curve at the point indicated by the standard deviation.

(Arranged from Pearson, 1914, by permission of the editor of Biometrika.)

These probabilities are often referred to as levels of significance
or levels of confidence. The one-per-cent level of significance simply
means that the probability is not more than 1 in 100 that a given
variation above or below an obtained measure will happen by
chance. A 5-per-cent level of significance means that similar chances
are only 5 in 100. For large groups, for example five hundred cases,
the one-per-cent level of significance of a difference between an
obtained mean and a given score would be a point 2.58 standard deviations
below or above the obtained mean, and the 5-per-cent level
would be a point 1.97 standard deviations from the mean (Illus. 129).
For small groups of from 25 to 50, these numbers are slightly larger,
2.79 and 2.06 respectively. For small samples (less than 25) the
centiles in Illus. 129 are too small at the extremes. R. A. Fisher (1925)
computed tables for such probabilities which he called t (Illus. 130).
Thus a logical development of this consideration of levels of significance
is the null hypothesis, a statement that the observed differences
in certain situations probably do not represent real differences,
but can be explained purely by chance, i.e., random variations in
sampling.

The probability that a certain variation or difference will occur by
chance is known from many careful measures of situations, such as
the chance distribution of heads when coins are tossed, or of cards
when random selections are made, or of many other permutations
and combinations. The mathematical study of probabilities has
yielded accurate formulae for predictions of chance. When applied
to the difference between means (Chapter XIII), the null hypothesis
states that there is probably no real difference between the true means
of the two samples if the difference divided by its own sigma ($D/\sigma_D$) is
less than 2.58, because such a difference would occur by pure chance
about 1 in 100 times.

When applied to a mean, the null hypothesis states that there is
probably no chance that the true mean varies away from the observed
mean more than three times its own standard error, since 2.58 SE
will include all except the one half of one per cent at each extreme.
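For large groups the critical ratios quoted above can be recovered from the normal curve. The sketch below is a minimal illustration using Python's standard library; the function names are assumptions introduced here, and because it uses the normal curve rather than Fisher's t it gives the large-sample values only, which run slightly below those for small groups.

```python
from statistics import NormalDist

def critical_ratio(level):
    """Two-sided critical ratio from the normal curve for a significance level."""
    return NormalDist().inv_cdf(1.0 - level / 2.0)

def is_significant(difference, sigma_of_difference, level=0.01):
    """True when D / sigma_D reaches the critical ratio for the chosen level."""
    return abs(difference) / sigma_of_difference >= critical_ratio(level)

one_per_cent = critical_ratio(0.01)   # about 2.58
five_per_cent = critical_ratio(0.05)  # about 1.96
```

A difference three times its own sigma passes the one-per-cent test, while one only twice its sigma does not.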

USES OF MEASURES OF DISPERSION 

Two common uses of measures of dispersion, the comparison of 
groups and the comparison of individuals in a group, are described 
below. 

Group Comparisons 

Often an investigator wants to indicate which of two groups has
a larger dispersion of scores on a test. For instance, in employment or
educational work it is important for administrators to know the
range of abilities of the persons in various groups with whom they
are dealing. When the groups have a normal form of distribution,
either the quartile deviation or the standard deviation is a convenient
measure of dispersion. When groups of 50 or less are compared, the
Q is used because it is less affected by extreme scores than is the SD.
Illustration 131 shows two groups of service ratings which are compared
in this fashion. In order to have these ratings comparable the
ranges for each supervisor must be nearly the same, or be made the
same by assigning standard scores or centiles to the ratings.

ILLUS. 130 DISTRIBUTION OF T RATIOS FOR VARIOUS DEGREES OF
FREEDOM; SIGNIFICANT AT 5-PER-CENT AND 1-PER-CENT LEVELS

* Degrees of freedom are defined as the number of persons in the group being
measured times the number of tests or variables. If only one test is used the degrees
of freedom equal N, the total number of persons.
Note: This table is read as follows: Where there is one degree of freedom the
difference between the mean and the other score under consideration must be
12.706 times the sigma in order to be significant at the 5-per-cent level, or when
there are 20 degrees of freedom a similar difference must be 2.086 times the sigma
in order to be significant at the 5-per-cent level, and 2.845 times the sigma in order
to be significant at the 1-per-cent level.

(Adapted from Fisher and Yates, Statistical Tables for Biological, Agricultural,
and Medical Research, Oliver & Boyd, Ltd., Edinburgh, by permission of the
authors and publishers.)

(Illus. 131, horizontal scale: Efficiency Score.) Both supervisors gave the same
median score for the group. Miss A assigned no very high or low
grades, hence their quartile range (Q) is 3.1. Miss B
used a much wider range of scores so that their Q is

Individual Comparisons 

A frequent use of measures of deviation is the indication of a person's
position in a known group of persons. A person's raw score may
be changed to a standard score, sometimes indicated by z, by subtracting
the mean (M) of the group from the person's score (X), and
dividing the remainder by the standard deviation ($\sigma$) of the group.
Thus, if John scored 39 (Illus. 118), he would have a standard score
of −.92, since his score would be .92 standard deviations below
the mean.

$$z = \text{standard score} = \frac{X - M}{\sigma} = \frac{39 - 48}{9.8} = \frac{-9}{9.8} = -.918 \text{ or } -.92$$

In order to eliminate decimal points and minus signs which are
somewhat troublesome, standard scores are often changed to T
scores. Standard deviations and equivalent T scores are shown in
Illus. 121 and Illus. 129. In T scores the mean of a group is placed
arbitrarily at 50, and one standard deviation is given the value of 10.
Thus, John, whose raw score is 39, would have a T score of 40.8.

$$T \text{ score} = \frac{10(X - M)}{\sigma} + 50 = \frac{10(-9)}{9.8} + 50 = -9.2 + 50 = 40.8$$
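John's standard score and T score can be reproduced with a short sketch. The group mean of 48 is an assumption inferred from the worked figures (a deviation of −9 with a standard deviation of 9.8); the function names are illustrative.

```python
def standard_score(raw, mean, sd):
    """z: distance of a raw score from the group mean in standard deviations."""
    return (raw - mean) / sd

def t_score(raw, mean, sd):
    """T score: mean placed at 50, one standard deviation worth 10 points."""
    return 10.0 * standard_score(raw, mean, sd) + 50.0

# Mean of 48 is inferred from the worked example, not stated in the text.
z = standard_score(39, 48, 9.8)
t = t_score(39, 48, 9.8)
print(round(z, 2), round(t, 1))
```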


T Scores Versus Centiles 


T scores have an advantage not enjoyed by centiles in that scores
may be added and subtracted without fear of introducing errors due
to inequality of scaling. This statement can be understood when
one looks at a normal frequency curve (Illus. 121). It is evident in this
illustration that the distance between the 50th and 69th centiles on
the base line is the same as the distance between the 93rd and 97th
centiles. This is due to the fact that in a normal curve the frequency
of scores is greater toward the center of the distribution. If a person's
scores on two tests are averaged, using centiles, an error is introduced.
For example, if Frank made T scores of 50 and 70 on two tests, the
mean of his T scores is $\frac{50 + 70}{2} = 60$. These T scores correspond to
the 50th and 97.72 centiles in Illus. 129. The mean of these centiles is
$\frac{50 + 97.72}{2} = 73.86$, which corresponds to a T score of only 56.4
instead of the correct standard score of 60. The use of centiles has
introduced an error in this case of 3.6 points. Such errors are usually
ignored because the original scores are rather rough and errors seldom
change one's position in a group.
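Frank's case can be worked through numerically. The sketch below assumes a normal distribution of scores and uses Python's standard library for the conversion between T scores and centiles; the helper names are illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def t_to_centile(t):
    """Centile corresponding to a T score under the normal curve."""
    return 100.0 * STANDARD_NORMAL.cdf((t - 50.0) / 10.0)

def centile_to_t(centile):
    """T score corresponding to a centile under the normal curve."""
    return 50.0 + 10.0 * STANDARD_NORMAL.inv_cdf(centile / 100.0)

t_scores = [50.0, 70.0]
mean_t = sum(t_scores) / len(t_scores)

# Averaging the centiles instead of the T scores understates the result.
mean_centile = sum(t_to_centile(t) for t in t_scores) / len(t_scores)
t_from_mean_centile = centile_to_t(mean_centile)
error = mean_t - t_from_mean_centile
```

The error arises only because equal centile steps are not equal distances on the base line of the curve.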


Profiles 

A person's ability can be portrayed by recording graphically his
scores on several tests that are considered to be important and somewhat
independent. Such a graphic record, called a profile, is constructed
by plotting a person's scores on a profile chart. Such a chart
is made by laying off on graph paper a line which represents the mean
of a group and other parallel lines which represent units of dispersion,
such as standard scores. A profile is one of the most interesting
and valuable ways to represent results of tests, for it gives a person
a clear picture of his strong and weak subjects. For the greatest
convenience the raw scores are placed at the proper intervals for
each test in order that the person may find his position in a group by
simply placing dots beneath the numbers which represent his scores.
Lines connecting these dots help to make a quick comparison. In
Illus. 132 mean-score profiles are shown for four occupational groups
on eight different tests based on T scores of a large adult population.

Similar profiles may be constructed from age scales in which the
scores in each test have been given chronological age equivalents.
Thus, Illus. 133 shows a profile from a Stanford Achievement Test.
This profile has the advantage of showing scores for average age
groups and grades in school. It also has the advantage of being composed
of steps which are probably as equivalent as the steps in the
standard deviation scales, because age differences become smaller
with advancing age.

To Combine a Person's Score on Several Tests

Occasionally one wishes to have a single score which will represent
a person's ability on several different tests. The selection of the best
way to combine scores depends upon what use is to be made of the
combination. One way to combine test results is to average a person's
T scores. Where few tests are involved the median is preferred since
it is less sensitive to extreme scores than is the mean.

ILLUS. 132 OCCUPATIONAL PROFILES FROM STANDARD TESTS

(Profile chart: tests listed vertically; standard scores from 36 to 62 on the horizontal scale.)

(Adapted from Dvorak, 1935. By permission of the University of Minnesota Press.)

Weighted Scores 

In some instances it is considered desirable to give more weight to
the scores in one test than to those in another. Methods of weighting
the scores are of two kinds. One method gives more credit for a harder
task or a task that is considered more important than another. Doing
this is generally a rather arbitrary matter. For example, in many
examinations the number of items correct on an arithmetic test is
multiplied by some weight such as 4, whereas the scores on several
other tests may be simply the number of items correct. Another
method uses a statistical analysis which shows how well various scores
predict a certain criterion. In order to make the best prediction,
scores are given various weights. Methods of combining scores in a
way to make the best possible prediction of a particular criterion are
to be found in standard statistical texts.

An able discussion of the effects of weighting various standard
scores upon their combined results is given by Guilford (1942),
McNemar (1949), and Garrett (1947), who point out that the differences
between various weightings depend upon (1) the number of scores
which are combined, (2) the uniqueness of the various scores, (3) the
shape of the distribution curve of each score, and (4) whether the
weights are constant or variable.

ILLUS. 133 PROFILE, STANFORD ACHIEVEMENT TEST

EDUCATIONAL PROFILE CHART: NEW STANFORD ACHIEVEMENT TEST, ADVANCED EXAMINATION

(Kelley, Ruch, and Terman, 1911. By permission of the World Book Co.)

STUDY GUIDE QUESTIONS 

1. Define concisely: scale, frequency table, range of scores, class interval,
histogram, frequency curve, normal curve, ogive curve, rank, centile, quartile.

2. Define mean, median, and mode. What advantages has each?

3. How does the use of a guessed mean save time in calculating the true
mean?

4. For what do the following symbols stand: M, Md, G.M., N, X, xi, i,
SD, σ, Pjo, z, T score?

5. How can letter grades be given a similar meaning in various situations?

6. How can centiles or T scores be found from raw scores?

7. What is the relation between Q and SD in a normal curve? Which is
preferred for small groups of persons?

8. What is meant by the reliability of a mean?

9. What are the advantages of a profile of scores over a single total score?



CHAPTER XIII 


MEASURES OF 
RELATIONSHIP 


This chapter indicates the principal ways of finding and indicating
the relationships between measures. Such information is useful in
selecting items for tests, in predicting future success from present
scores, and in theoretical analyses of elements of behavior. First,
scattergrams and bar charts are discussed, then correlation coefficients
are illustrated. Lastly, certain errors which make correlations too
high or too low are presented.

PREDICTIONS OF PROBABLE SCORES
FROM KNOWN SCORES

To a great extent psychological measurement is intended to render
an accurate prediction of the probable quality and quantity of an
individual's development. Predictions are based upon the assumption
that an individual will develop as persons known to be similar
to him have developed in the past. They may take the form of a
statement of probability that a person will have a certain score in one
variable when he has a particular score in another. The technique
for ascertaining such probabilities is relatively simple. Much credit
for developing this technique goes to Sir Francis Galton (1886) in
connection with his studies of the inheritance of genius. Starting with
sheets of graph paper and some pins, he devised a method of measuring
relationships among variables.

Scatter Diagrams 

In finding relationships between parents and children, Galton
arranged his data on graph paper, as shown in Illus. 123, so that the
heights of adult children were indicated on the base line, the X scale,
and the heights of midparents (average of the two parents) on a
vertical line, the Y scale. For each parent-child pair a pin was placed
to show the scores of both the midparent and the child. Thus, a
midparent's score of 71 and a child's score of 64 would be represented by
a pin in square A, which lies in the column containing the child's
score and the row containing the adult's score. Such charts, called
scatter diagrams, scattergrams, or double entry tables, allow one to
see at a glance the main aspects of relationship. For instance, in
Illus. 134 it appears that there is a marked relationship between the
heights of parents and the heights of their children, but some pairs
in the upper right and lower left portions show marked deviations
from the general rule. There is also a marked tendency for children
to be a little nearer the average height of the group than their parents,
and vice versa.


ILLUS 134 SCATTER DIAGRAM WITH PINS 



Bar Charts 

For small groups, and on relatively unstandardized material,
simple scattergrams or bar charts show trends effectively. Illustration
135 shows the results of comparing scores on a selection test with
ratings of success in a clerical position. The following steps are to be
taken in making such a chart:

a. Make a scattergram with the X scale (horizontal) showing the
preliminary score and the Y scale (vertical) showing the criterion.

b. Determine the level of success which is desired in this particular
situation. (This is usually the work of the supervisory or management
staff.)

c. Draw a horizontal line on the chart dividing the group into two
subgroups; the high group will include those who attained the desired
level of success, and the low group those that did not. (In this
case the line is drawn at 4.)

d. Draw a vertical line so that a large proportion of the satisfactory
workers will fall to the right of it and a large proportion of the
unsatisfactory to the left. This line, placed at 30, is tentatively called
the critical score. The critical score may, of course, be shifted to the
right or to the left as the needs of the organization or the caliber of
applicants for work change.

e. Compute the percentage of each subgroup that falls to the
right and to the left of the line. If a critical score of 31 were used,
this would indicate that 75 per cent of the high group and only 20
per cent of the low group would be included, while 25 per cent of the
high group and 80 per cent of the low group would be excluded.

f. Draw a bar chart which shows for each subgroup the percentage
of persons to the right and to the left of the critical score.
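Steps c through e can be sketched in a few lines. The worker data below are hypothetical, invented only so that the percentages match those quoted in step e; the function name and the convention that a score equal to the critical score counts as included are also assumptions.

```python
def split_by_critical_score(pairs, success_level, critical_score):
    """Percentage of the high and low criterion groups at or above the critical score.

    pairs holds one (test_score, criterion_rating) tuple per worker.
    """
    high = [score for score, rating in pairs if rating >= success_level]
    low = [score for score, rating in pairs if rating < success_level]

    def pct_included(scores):
        if not scores:
            return 0.0
        return 100.0 * sum(s >= critical_score for s in scores) / len(scores)

    return {"high_included": pct_included(high), "low_included": pct_included(low)}

# Hypothetical workers: (selection test score, success rating).
workers = [(35, 5), (33, 6), (31, 4), (28, 5),
           (36, 2), (25, 3), (22, 1), (27, 2), (29, 3)]
result = split_by_critical_score(workers, success_level=4, critical_score=31)
print(result)
```

Shifting `critical_score` up or down shows directly how many satisfactory workers are lost and how many unsatisfactory ones are admitted, which is the information the bar chart is meant to display.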

ILLUS. 135 SCATTERGRAM AND BAR CHART



It may be desirable to have three subgroups: high, middle, and
low. Many variations of the bar graph occur. All have the advantage
of showing graphically what the effects of using a particular cut-off
score would probably be. Another equally good method is to use
distribution curves for the groups to be compared, as shown in Illus.
126.

ILLUS. 136 CORRELATION TABLE
Adult Children; i = 1 inch

Answers

Formula 1. $\text{Mean } X = \text{Guessed Mean} + \frac{\sum x'}{N}\, i = 68 + (.105)1 = 68.11$

Formula 2. $\text{Mean } Y = \text{Guessed Mean} + \frac{\sum y'}{N}\, i = 68 + (-.199)1 = 67.80$

Formula 3. $SD_x = \sqrt{\frac{\sum x'^2}{N} - \left(\frac{\sum x'}{N}\right)^2} = 2.55$*

Formula 4. $SD_y = \sqrt{\frac{\sum y'^2}{N} - \left(\frac{\sum y'}{N}\right)^2} = 1.82$*

Formula 5. $r = \dfrac{\frac{\sum x'y'}{N} - \left(\frac{\sum x'}{N}\right)\left(\frac{\sum y'}{N}\right)}{SD_x \cdot SD_y} = \frac{2.09}{4.63} = .451$

Formula 6. $SD_r = \frac{1 - r^2}{\sqrt{N}} = \frac{1 - .2034}{30.46} = \frac{.7966}{30.46} = .026$

Formula 7. $K_r = \sqrt{1 - r^2} = .898$

* Must be multiplied by i if SDx and SDy are desired.

(From data in Galton, 1886.)

Regression Lines

When larger groups or more precise measures are involved, bar
charts are often very useful, but more detailed indices of relation
may also be desired. For example, in order to predict the most probable
height of a child from midparents of a particular height, say 69
inches, Galton assumed that the most frequent height would be the
same as the median height of children in the particular row of
69-inch parents. A glance at Illus. 136 shows this median height to be
a trifle less than 69 inches.

The general relationship between parents' and children's heights
will be indicated by a line drawn through the medians of the vertical
rows, which is called a regression line of children on parents. A similar
line drawn through the midpoints of the horizontal columns
is called the regression line of parents on children. When both of
these lines happen to be straight, one of them alone is enough to
show both relationships. Illustration 136 shows regression lines that
are slightly curved and broken. Straight lines could be drawn in
such a way, however, that they pass near the median points. Since
this is the case, the regression lines can be treated as if they are
straight.¹

Upon inspection of Illus. 136 it will be seen that all midparents
who were in the class interval labeled 70 had children whose median
height was lower, 69 inches. Short midparents in class interval 63
had children whose median height was higher, 65 inches. From
such measures Galton generalized his famous law of filial regression,
namely, that children tend to be more like the central tendency of
the group than do their parents.

The term regression line is now applied to all lines which indicate
the relation between the class intervals on one scale and the corresponding
mean or median scores on another scale. When the regression
line is straight, this relation can be found and given a quantitative
value for the group as a whole by measuring its slope from the
base line.

¹ The line of best fit, that is, the line which has the smallest total deviation from
all the dots, can be calculated. Its slope is given by the product-moment correlation
coefficient described in the next section.

Slope of Regression Lines; Correlation Coefficients

The slope of a straight regression line from the base line can be
indicated by the ratio between the distances from the horizontal and
vertical scales or axes, and any point except zero on the regression
line. Thus, in Illus. 137, Diagram 3, the point P is 3 units from the
horizontal axis or base line, and 5 units from the vertical axis. The
ratio between these two lines (3/5 = .60) will be the same as the ratio
of two similar lines measured from another point on the same
regression line.

The slope (in this case .60) is known in trigonometry as the tangent
of the angle between the regression line and the base. Since tangents
have been calculated carefully for many angles, the tangent is a
convenient figure to use for the slope of a line. When the horizontal and
vertical scales are changed to standard deviation units, variations in
raw scores are eliminated from a scatter diagram. In this case the
tangent of the regression angle is the correlation coefficient. The
correlation coefficient thus shows the slope of a regression line when
the two tests are scaled alike. For a 45-degree angle the tangent and
the correlation coefficient are 1.00. If the regression angle approaches
zero, the correlation coefficient approaches zero. A zero indicates that
the relationship between the variables compared is merely chance.

A chance relationship means that a person who makes a particular 
score in one test has the same probability as anyone else in the group 
of making any one of the scores in the other test. Illustration 137 
shows regression lines and correlation coefficients for six typical situa- 
tions. Diagrams 1 to 4 illustrate elliptical or circular boundaries and 
regular distributions which probably indicate that both variables are 
normally distributed. If the tally marks show large gaps, as in
Diagram 5, it is clear that the variables are irregularly distributed. If
the marks show a fan-shaped distribution, as in Diagram 6, the corre- 
spondence of scores between the two variables is evidently much 
closer at the lower end than at the higher end of the distribution. 

Product-Moment Correlation Coefficients 

Since the process of making scatter diagrams and drawing regres- 
sion lines is laborious, shorter methods have been devised for cal- 
culating relationships. One of the best and most commonly used, the 
product-moment correlation method developed by Pearson (1904), 
is designed for use when both tests show fairly normal distributions, 
when the regression lines are approximately straight, and when 
large numbers of individuals are involved. 

ILLUS. 137 TYPICAL SCATTER DIAGRAMS AND REGRESSION LINES

The product-moment correlation method makes use of the fact
that if all persons have exactly the same relative position on two
tests the correlation is perfect. But the more they differ in relative
position, the more the correlation is lowered. Since relative position
is well indicated by SD scores, one may proceed to find the difference
(D) between the two SD scores for each person, taking account of
their signs. The following formula will then yield a product-moment
correlation (r) when N is the number of individuals:

$$r = 1 - \frac{\sum D^2}{2N}$$

Product-moment correlations are usually found, however, as shown
in Illus. 136, without the labor of computing SD scores. The steps
are as follows:²

1. A scatter diagram is constructed. (This is not necessary for certain
machine methods.)

2. A guessed mean is located, and deviations x' and y' are indicated
for each variable respectively, and added. These sums are used to
correct for guessing.

3. The frequency (f) of each row and column is found.

4. The frequencies are multiplied by the deviations squared, and
substituted in formulas 3 and 4 to secure the standard deviations.

5. The frequency of each cell is multiplied by its x'y' value, which
is the product of the x' and the y' deviation for each cell. The x'y'
values are added in the two righthand columns and substituted in
formula 5 to secure the correlation coefficient.

The availability of machines which will add, subtract, multiply,
and divide makes it possible to compute correlations fairly rapidly
without the use of scatter diagrams or guessed means. The usual
formula is:

$$r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{N\sum X^2 - (\sum X)^2}\,\sqrt{N\sum Y^2 - (\sum Y)^2}}$$

The advantage of using this procedure is that raw scores, their
squares, or their products can be added quickly and accurately. One
serious disadvantage is that the valuable information about the pattern
of coincidence which appears in a scatter diagram is lost. This
formula can, of course, be used in hand computation, but the formula
in Illus. 136 serves to keep the numbers small and to eliminate
insignificant figures.
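The raw-score formula and the SD-score formula can be compared in a brief sketch. The heights below are hypothetical values chosen only for illustration, and the function names are assumptions; the SD-score version divides by N (not N − 1) when finding each standard deviation, which is what makes the two results agree exactly.

```python
import math

def pearson_r_raw(xs, ys):
    """Product-moment r from raw scores, their squares, and their products."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / (
        math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2)
    )

def pearson_r_from_sd_scores(xs, ys):
    """r = 1 - sum(D^2) / 2N, where D is the difference between SD scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sdx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sdy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    d2 = sum(((x - mx) / sdx - (y - my) / sdy) ** 2 for x, y in zip(xs, ys))
    return 1.0 - d2 / (2.0 * n)

# Hypothetical heights in inches, for illustration only.
children = [64, 66, 67, 68, 70, 71]
midparents = [66, 67, 67, 68, 69, 69]
r_raw = pearson_r_raw(children, midparents)
r_sd = pearson_r_from_sd_scores(children, midparents)
```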


SHORT METHODS 

The Pearson Correlation Coefficient just described is recognized
as the standard for the most careful work. The labor involved is
usually great, however, even when the best machines available are
used. Shorter methods are sufficiently adequate: (1) when very small
samples are measured, and one or two extreme scores would bear
a large share in determining the coefficient, (2) when one or both
variables yield only two scores, such as true or false, pass or fail, and
(3) when large samples are measured, so that each part of each
distribution is well represented. Three short methods, the rank order,
the tetrachoric, and the biserial correlation methods, will be
described.

² Formulas 1 and 2 are used to secure the means and are not essential to the
product-moment method.

Rank-Order Correlation Coefficients 

When a group is small, correlation coefficients are often secured
by the rank-order method. Illustration 138 shows how a correlation
of .564 was found between clerical test scores and a supervisor's
ranking of ten persons. First, the scores are changed to ranks, then the
differences (D) between each person's two ranks are found and
squared. Lastly, the sum of the squared differences is substituted in
the formula

$$\text{Rho} = \rho = 1 - \frac{6\sum D^2}{n(n^2 - 1)}$$

ILLUS. 138 RANK-ORDER CORRELATION METHOD

Rho ($\rho$) will be 1.00 and thus show a perfect correlation if the D's
are all zero. Rho will be .00 if the ranks have only a chance relationship,
and −1.00 when there is a perfect negative relationship. Rho
is nearly always a little smaller than r, a product-moment correlation
for the same set of figures.
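The rank-order steps can be sketched as follows. The scores used in the checks are hypothetical; giving tied scores the average of the ranks they occupy is a common convention assumed here, not something the text specifies.

```python
def ranks(values):
    """Rank 1 for the highest value; tied values share the average rank."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2.0 for v in values]

def rank_order_rho(xs, ys):
    """Rho = 1 - 6 * sum(D^2) / (n * (n^2 - 1)), D being the rank differences."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))

# Hypothetical scores: identical order gives 1.00, reversed order gives -1.00.
same_order = rank_order_rho([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
reverse_order = rank_order_rho([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
```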

Tetrachoric Correlations 

When a large number of correlations are desired, a short method,
called the tetrachoric method, is sometimes used. Theoretically one
has to have about twice as many cases to get the same reliability from
a tetrachoric as from a Pearsonian correlation, but the labor involved
is much less. The procedure is as follows:

a. By using convenient points, usually somewhere near the middle
of each distribution, divide a scattergram into quarters (Illus. 139).

b. Determine the percentages of the whole group that lie in
each quarter and in each half.

c. Insert these figures in nomographs prepared by Cheshire, Saffir,
and Thurstone (1933), or by Jenkins (1950). The value of the
tetrachoric r for the data in Illus. 139 is .79.

Biserial Correlations 

When one of the variables yields only two categories, such as pass
and fail, a biserial correlation can be computed as shown in Illus. 140.
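The biserial formula given with Illus. 140, $(M_p - M_f)/\sigma_t \times pq/y$, can be sketched as below. The two means are hypothetical values chosen only so that their difference is 10.6; the standard deviation of 9.5 and the 58 per cent passing follow the illustration. The ordinate y is taken from the normal curve at the point dividing it into areas p and q, so the result differs in the third decimal from a hand computation with a rounded ordinate.

```python
from statistics import NormalDist

def biserial_r(mean_pass, mean_fail, sd_total, p_pass):
    """Biserial r = (Mp - Mf) / sd_total * (p * q / y).

    y is the height of the unit normal curve at the point that cuts off
    the proportions p and q.
    """
    q_fail = 1.0 - p_pass
    unit = NormalDist()
    y = unit.pdf(unit.inv_cdf(p_pass))
    return (mean_pass - mean_fail) / sd_total * (p_pass * q_fail / y)

# Hypothetical means (difference of 10.6); SD and proportion as in Illus. 140.
r_bis = biserial_r(mean_pass=42.0, mean_fail=31.4, sd_total=9.5, p_pass=0.58)
```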

ILLUS. 139 TETRACHORIC CORRELATION DATA

$r_t = .79$

ILLUS. 140 BISERIAL CORRELATION

An illustration is as follows:

Score      Fail    Pass    Total
55                  3        3
50           1      7        8
45           1      5        6
40           2      7        9
35           7      4       11
30           3      2        5
25           1      1        2
20           6               6
Total       21     29       50
Per cent    42%    58%     100%

$$\text{Biserial } r = \frac{M_p - M_f}{\sigma_t} \times \frac{pq}{y}$$

Where $M_p$ is the mean of the pass column,
$M_f$ is the mean of the fail column,
p is the per cent of cases in one column,
q is the per cent of cases in the other column,
$\sigma_t$ is the standard deviation of the whole group of scores,
y is the ordinate of the normal curve for values corresponding to
p and q, when they are assumed to represent areas under the
normal curve (Illus. 129).

Substituting values for the present problem:

$$\text{Biserial } r = \frac{M_p - M_f}{9.5} \times \frac{.42 \times .58}{.39} = 1.11 \times .621 = .692$$

Kuder-Richardson Formula 21

This is an indication of reliability which is widely used in preference
to the split-half Spearman-Brown Formula. It is preferred because
reliability coefficients can be calculated in about 2 minutes
when only the mean, the SD, and the number of items are known:

$$r_{tt} = \frac{n}{n-1} \cdot \frac{\sigma^2 - npq}{\sigma^2}$$

in which $r_{tt}$ is the correlation of a test with itself, n is the number of
unweighted items in the test, $\sigma$ is the standard deviation of scores, and
p is $\frac{M_t}{n}$, where $M_t$ is the mean of the scores. This is the average
item-difficulty. q is (1 − p).

Kuder-Richardson Formula 21, which gives values that are
slightly lower than the Spearman-Brown estimates, assumes that all
the items in the test have the same difficulty. Although this assumption
is seldom true, it usually introduces only a small error in the
estimation. Another formula is provided for situations where a small
number of items and large differences in their difficulty make it
inadvisable to use Formula 21.
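Kuder-Richardson Formula 21 needs only three numbers, as the following sketch shows. The test of 50 items with a mean of 30 and a standard deviation of 8 is hypothetical, and the function name is an assumption.

```python
def kr21(n_items, mean, sd):
    """Kuder-Richardson Formula 21 from the item count, mean, and SD.

    p is the average item difficulty (mean / n_items) and q is 1 - p.
    """
    p = mean / n_items
    q = 1.0 - p
    variance = sd ** 2
    return (n_items / (n_items - 1)) * (variance - n_items * p * q) / variance

# Hypothetical test: 50 items, mean score 30, standard deviation 8.
reliability = kr21(n_items=50, mean=30.0, sd=8.0)
print(round(reliability, 3))
```

Because the formula treats every item as equally difficult, the value should be read as a quick estimate rather than a precise coefficient.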


PREDICTIONS OF A SCORE 

It is often desirable to determine the probable limits of a particular
estimate. They can be determined by applying correlation coefficients
to the respective distributions as follows:





Individual Predictions 

When the score on one test is known, individual predictions of
the most probable score on another test can be made in two ways:
from a scatter diagram, and from an equation which represents a
regression line.

Prediction from a scatter diagram is made by finding the most frequent
score in Y which corresponds to a particular score in X. For
example, in Illus. 136 the most probable height of a child whose
midparents are in the 63.0 class interval is 65.0 inches.

Prediction from a regression equation avoids the rather lengthy
process of making a scatter diagram and calculating the central
tendencies of each class interval. Regression equations also give a little
better prediction since they are less affected by minor variations
of a chance variety than are the central tendencies of class intervals.
Regression lines smooth out minor chance deflections. In finding the
most probable height of children (X scale) from parents who are
63.5 inches tall (Y scale) a convenient formula to use is

$$X - M_X = r_{XY}\frac{\sigma_X}{\sigma_Y}(Y - M_Y)$$

This formula looks more complicated than it is, for
it is really a short statement of familiar terms, namely,

X is the score to be predicted.

$M_X$ (called M sub X) is the mean of all X scores = 68.4 inches.

Y is the given score = 63.5 inches.

$M_Y$ (M sub Y) is the mean of all Y scores = 68.3 inches.

$r_{XY}$ is the correlation coefficient found between X and Y which
shows the slope of the regression line = .451.

$\sigma_X$ and $\sigma_Y$ are the standard deviations of the X and Y distributions
(included to equalize the group dispersions).

The most probable height of the child is therefore:

$$X = r_{XY}\frac{\sigma_X}{\sigma_Y}(Y - M_Y) + M_X = \left[.451 \cdot \frac{2.55}{1.82}(63.5 - 68.3)\right] + 68.4 = 65.37$$
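The prediction can be reproduced with a short sketch. It assumes that the standard deviations entering the ratio are the 2.55 and 1.82 found in Illus. 136, since those values yield the 65.37 reported; the function name is illustrative.

```python
def predict_x(y, r_xy, sd_x, sd_y, mean_x, mean_y):
    """Most probable X for a given Y from the regression equation."""
    return r_xy * (sd_x / sd_y) * (y - mean_y) + mean_x

# SDs of 2.55 and 1.82 are assumed from Illus. 136.
predicted_height = predict_x(
    y=63.5, r_xy=0.451, sd_x=2.55, sd_y=1.82, mean_x=68.4, mean_y=68.3
)
print(round(predicted_height, 2))
```

When the given score equals the mean of Y, the prediction is simply the mean of X, which is a quick check on any use of the equation.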

Standard Error of Estimate 

Since the score just found, 65.37, is the most probable score, it is
of value to find out how accurate the prediction is. A particular son's
height may, of course, vary considerably from the most probable
height. The probable amount of this variation can be found by
calculating the standard deviation of the estimated score. This is
called a standard error of estimate, and is written $\sigma_{X \cdot Y}$
when the variation of a predicted X score for a given value of Y is
desired. From a standard error of estimate the range of scores which
will include any particular proportion of children in a particular row
can be found.

A general standard error of estimate⁸ to apply to all probable scores
is given in the formula
$\sigma_{X \cdot Y} = \sigma_X\sqrt{1 - r_{XY}^2}$. When applied to the
data in Illus. 136, it becomes

$$\sigma_{X \cdot Y} = 2.55\sqrt{1 - (.451)^2}$$

$$= 2.55\,(.887)$$

$$= 2.26$$

From this it appears that the middle 68 per cent of the children
who have parents of a particular height will fall within heights
ranging from 2.26 inches above to 2.26 inches below the most probable
score. For instance, when the most probable score of a child is 65.37,
the middle 68 per cent of the children will range from 63.11 to 67.63
inches in height.
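
A brief Python sketch of the standard error of estimate and the band it implies follows. The function names are assumptions made for illustration. Direct evaluation of the radical gives a value near 2.28, a shade above the 2.26 used in the text, so the band below is computed from the text's figure.

```python
import math

def standard_error_of_estimate(sd_x, r_xy):
    """sd_X * sqrt(1 - r^2): the spread of actual X scores about the predicted X."""
    return sd_x * math.sqrt(1.0 - r_xy ** 2)

def middle_68_band(predicted, se):
    """Range of one standard error on either side of the predicted score."""
    return (predicted - se, predicted + se)

se_computed = standard_error_of_estimate(2.55, 0.451)
low, high = middle_68_band(65.37, 2.26)   # 2.26 is the value used in the text
```

The band runs from 63.11 to 67.63 inches, as in the worked example.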

We can also predict what are the chances in 100 that a child of
parents of a certain height will attain any given height, by finding the
proportion of all the children of parents of this height who have
reached the given height. This is done by finding the difference
between the most probable height of children and the given height,
and dividing this difference by the standard error of estimate. The
result of this division will be the number of standard deviations from
the most probable height, and this can easily be translated from
Illus. 129 into percentages. Suppose we wish to know what the
chances are that Frank, whose parents average 63.5 inches, will be 62
inches or less in height. We have just found that his most probable
height will be 65.37 inches and the standard error of estimate is
2.26 inches. The difference, 65.37 - 62, divided by the standard error
of estimate gives 1.49. Illus. 129 shows that -1.49 standard deviations
is equivalent to the 6.9 centile. We may, therefore, conclude that
Frank has only about 7 chances in 100 of being as short as 62 inches.
These calculations may seem a bit complicated at first, but, after
working out several examples, the routine becomes easy. In practice
it is often desirable to make such predictions.
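
The same calculation can be sketched in Python, with the normal-curve table of Illus. 129 replaced by the error function of the standard library. That substitution, and the function names, are assumptions of this sketch; the heights and the standard error are the text's.

```python
import math

def normal_cdf(z):
    """Proportion of a normal distribution lying below z standard deviations."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chance_at_or_below(height, predicted, se_estimate):
    """Chance that an individual falls at or below `height`."""
    z = (height - predicted) / se_estimate
    return normal_cdf(z)

z_frank = (62.0 - 65.37) / 2.26
p_frank = chance_at_or_below(62.0, 65.37, 2.26)
```

The result is a little under .07, in agreement with the figure of about 7 chances in 100.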

⁸ The formula is only approximate, since it gives one standard error of
estimate for all columns. From inspection of a scatter diagram one can
see that this is not actually the case. The columns near the end have
smaller dispersions than those in the middle. The standard error of
estimate is the average standard deviation of the columns. It is
commonly used for individual predictions.

Standard Error of a Score

The same logic has been applied to prediction of the probable 
variation of any single score on a test. It has been shown that if two 
tests correlate highly, the standard error of estimate of a score pre- 
dicted from one test to the other will be very small. Similarly, if a test 
has a high self-correlation on two trials, the standard error of estimate 
of a score predicted from one trial to the other will be small. Since 
in this case the same test has been repeated, some authors conclude 
that the tendency for a single score to vary by chance from its true 
value can be shown by the formula for a standard error of a score, 
which is expressed thus: $\sigma_\infty = \sigma_1\sqrt{1 - r_{12}}$.
Here $\sigma_\infty$ indicates the theoretical standard deviation of an
infinite number of obtained scores from the most probable true score;
$\sigma_1$ is the standard deviation of the actual scores in trial 1 of
the test; and $r_{12}$ is the correlation between trials 1 and 2.

This formula may be applied only when the means and sigmas of 
the two trials are nearly the same. The standard error of a score 
would not show the true tendency toward variation if the scores or 
their dispersions should become larger or smaller with repetitions. 

The standard error of a score is often a more practical indication 
of the predictive value of a test than the correlation between trials, 
since it shows the amount of variation which may be expected. In a 
revision of the Binet Scale, Terman and Merrill (1937) reported
that IQ's from 90 to 109 had standard errors of 4.51 when the
estimated correlation between two forms of the test was .924. This means
that the chances are 2 out of 3 that the true IQ is not higher or lower 
than the obtained IQ by more than 4.51 points when the obtained IQ 
is between 90 and 109. 
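
The standard error of a score may be sketched as follows. The standard deviation of the IQ's is not stated in this passage; the sketch therefore works backward from the reported standard error and correlation to the standard deviation they imply, which is an inference of this example and not a figure from Terman and Merrill.

```python
import math

def standard_error_of_score(sd_trial1, r_12):
    """sd_1 * sqrt(1 - r_12): chance variation of a single obtained score."""
    return sd_trial1 * math.sqrt(1.0 - r_12)

def implied_sd(se_score, r_12):
    """Standard deviation consistent with a given standard error and r_12."""
    return se_score / math.sqrt(1.0 - r_12)

sd_implied = implied_sd(4.51, 0.924)
se_check = standard_error_of_score(sd_implied, 0.924)
```

The implied standard deviation is a little over 16 IQ points. Raising the correlation between forms toward 1.00 shrinks the standard error toward zero.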

Standard Error of a Difference ($\sigma_{\mathrm{diff}}$)

Often one is eager to know whether the difference between the 
means of two groups would be likely to occur by chance if the measures 
were repeated a number of times. Suppose that 570 men who succeeded
in a pilot-training course had a mean score on a physical science
knowledge test of 142, SD 10, and 211 men who failed the same course
had a mean of 141, SD 7. What is the probability that the difference
of one point between the means would happen by chance if the same
groups were measured with the same test a number of times?

Because the work of retesting would be so great, estimates are 
generally used. We can estimate the probability if we assume that 

a. The differences will fall into a normal curve if we measured
them a large number of times.

b. The size of the difference depends upon the range of scores
of both groups.

c. The reliabilities of the means depend upon the number of per- 
sons measured. 

d. The observed difference can be taken as the best available in- 
dication of the true or average difference. 

Since these assumptions can be demonstrated to be reasonably true,
the standard deviation of the hypothetical distribution of differences,
also called the standard error of a difference, is given as:

$$\sigma_{\mathrm{diff}} = \sqrt{\sigma_{M_1}^2 + \sigma_{M_2}^2}$$

For the pilots' scores cited above the standard error of the
difference is:

$$\sigma_{\mathrm{diff}} = \sqrt{\ldots + .49^2} = .54$$

When the difference between means is based on two scores of the
same group of persons, the correlation must be considered, for a
high correlation means, among other things, that there is little
random shifting of scores. The standard error of this difference is
expressed thus:

$$\sigma_{\mathrm{diff}} = \sqrt{\sigma_{M_1}^2 + \sigma_{M_2}^2 - 2\,r_{12}\,\sigma_{M_1}\sigma_{M_2}}$$

Thus, if 570 men showed a mean of 142 (SD 10) on a test before
training, and 152 (SD 11) on the same test after training, and there
is a correlation of .82 between tests, then

$$\sigma_{\mathrm{diff}} = \sqrt{.42^2 + .46^2 - 2(.82)(.42)(.46)}$$

$$= \sqrt{.1764 + .2116 - .3168}$$

$$= \sqrt{.3880 - .3168}$$

$$= \sqrt{.0712} = .267$$
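
Both forms of the standard error of a difference can be sketched in Python. The sketch assumes that the standard error of each mean is its SD divided by the square root of the number of cases, which agrees with the values .42 and .46 entering the worked example; the function names are hypothetical.

```python
import math

def se_mean(sd, n):
    """Standard error of a mean, taken here as SD / sqrt(N) (an assumption)."""
    return sd / math.sqrt(n)

def se_diff_independent(se_1, se_2):
    """Two separate groups: sqrt(se_1^2 + se_2^2)."""
    return math.sqrt(se_1 ** 2 + se_2 ** 2)

def se_diff_correlated(se_1, se_2, r_12):
    """Same persons measured twice: sqrt(se_1^2 + se_2^2 - 2 r se_1 se_2)."""
    return math.sqrt(se_1 ** 2 + se_2 ** 2 - 2.0 * r_12 * se_1 * se_2)

se_before = se_mean(10, 570)
se_after = se_mean(11, 570)
se_d = se_diff_correlated(se_before, se_after, 0.82)
```

Evaluating the expression under the radical and taking its square root gives a standard error of about .27 for the before-and-after comparison. A high correlation between the two testings makes this much smaller than the value for two independent groups with the same dispersions.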

Critical Ratio 


In order to determine just how significant a difference between
means is, the difference is divided by its own standard error. This
ratio is often called the critical ratio and is written:

$$CR = \frac{M_1 - M_2}{\sigma_{\mathrm{diff}}}$$

The CR is a widely used index which derives its significance from the
fact that the SD in a normal curve always bears a definite relation to
the area or centile scores of the distribution. These are shown in
Illus. 121 and Illus. 129. Several elaborate published tables give
these figures to six or more decimal places. For ordinary purposes two
decimals are enough. Thus, in the example of successful and
unsuccessful pilots given above, the critical ratio is:

$$CR = \frac{1}{.54} = 1.85$$

Examination of Illus. 129 shows that 1.85 standard deviations below
the mean equal a centile of 3.32, which means that the difference
will probably be as low as zero by chance in a little more than 3 times
in 100.

If the CR is more than 3.00, the hypothetical differences would be
as small as zero only 14 times in 10,000, because the normal curve
has only 14/10,000ths of its area as much as 3 SD's below the mean.
Hence the probabilities are small that a difference which is three
times its own SD will occur by chance.
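
The critical ratio and the chance it implies may be sketched as follows. The normal-curve area is computed with the complementary error function in place of the table of Illus. 129, which is an assumption of this sketch; the difference of one point and the standard error of .54 are the text's figures.

```python
import math

def critical_ratio(difference, se_diff):
    """Difference between means divided by its own standard error."""
    return difference / se_diff

def chance_of_zero_or_less(cr):
    """Area of the normal curve lying more than `cr` SD's below the mean."""
    return 0.5 * math.erfc(cr / math.sqrt(2.0))

cr_pilots = critical_ratio(1.0, 0.54)
p_pilots = chance_of_zero_or_less(cr_pilots)
p_three = chance_of_zero_or_less(3.0)
```

For the pilots the chance is a little over 3 in 100; for a critical ratio of 3.00 it falls to roughly 14 in 10,000.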

APPLICATIONS OF CORRELATION TECHNIQUES 

The use of correlation techniques has spread remarkably during 
the last 20 years. Every field of psychological investigation now makes 
use of them in situations where relationships are to be appraised. A
complete list of such situations would be too long to present here,
but the main applications of these techniques include related per- 
sons, related scores, and related items. 

Related Persons 

Pairs of persons who are related in some ways are measured and the 
score of one member of a pair is plotted against the score of the other 
member. This method is of prime importance in studies of inheritance
where familial resemblances are being scrutinized. Studies of
pairs of identical twins, of mothers and daughters, of fathers and 
sons, of siblings, of cousins, and of grandparents and grandchildren 
have shown typical relationships by correlation techniques. 

Related Scores 

When a group of persons have been appraised in two situations, 
the scores in one situation may be plotted against the scores in the 
other. Literally thousands of studies of this sort have been made to 
show relationships of tests or ratings. Correlations between test scores
and later success in school or occupation are commonly used to
predict success within known limits. In animal experimentation
correlations of performance scores with measures of deprivation or of
brain injuries have proved useful.

Related Items 

When a group of items has been rated or ranked for some quality, 
such as difficulty, aesthetic quality, or emotional value, the ranks 
assigned by one judge may be correlated with those assigned by other 
judges. This procedure has been used to show linearity of
relationships between items when judged by various groups, relationships
which must be known in the accurate scaling of test items.

ERRORS IN APPLYING CORRELATION 
COEFFICIENTS 

The widespread use of regression and prediction formulas makes
it necessary to guard against spuriously high or low correlation
coefficients. Spuriously high or low r's are sometimes obtained if
regression lines and prediction formulas which are not applicable to
the case are used. Some of these errors can be avoided rather easily,
but others are hard to eliminate. Correlation techniques must be
carefully interpreted in order to give a clear representation of the
facts.

Errors Which Usually Lower Correlations below Their True Value

The following six types of errors tend to make correlation
coefficients lower than they would be if normal groups were carefully
measured.

Grouping of Results in Large Class Intervals. If, in the
construction of a scatter diagram, results are grouped into too few class
intervals, say two or three, some scores are treated as if they were
nearer the center of the group than they are. This procedure usually
results in a slightly smaller correlation coefficient than would result
from using fifteen class intervals, because a high correlation generally
depends upon having wide and accurately determined deviations
among a group of scores. A correction for this type of error is given
in standard texts on statistics. The amount of correction is generally
small, hence it is used for only the most careful kind of comparisons.
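
The effect of coarse grouping can be illustrated by a small simulation. Everything in it is an assumption chosen for illustration: artificial normal scores with a true correlation of .80, a class width of 0.4 SD (roughly fifteen intervals), and a class width of 3 SD (essentially two intervals).

```python
import math
import random

def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def grouped(value, width):
    """Replace a score by the midpoint of its class interval."""
    return (math.floor(value / width) + 0.5) * width

rng = random.Random(0)          # fixed seed so the run is repeatable
true_r = 0.80
xs, ys = [], []
for _ in range(4000):
    x = rng.gauss(0, 1)
    y = true_r * x + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
    xs.append(x)
    ys.append(y)

r_raw = pearson_r(xs, ys)
r_fine = pearson_r([grouped(x, 0.4) for x in xs], [grouped(y, 0.4) for y in ys])
r_coarse = pearson_r([grouped(x, 3.0) for x in xs], [grouped(y, 3.0) for y in ys])
```

With many narrow intervals the coefficient is almost unchanged, whereas with two broad intervals it drops markedly.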

A Curved Regression Line. It sometimes happens that regression
lines are curved rather than straight, as in Illus. 137, Diagrams 5 and
6. Curved regression lines may be caused by skewed distributions or
by unequal steps in a scale. A product-moment correlation
coefficient for such situations will be too low. The true correlation
coefficient may be found by using the eta formula. There are several ways
of finding out whether a regression line is curved, without actually
drawing a large scatter diagram. These procedures are given in more
advanced texts on statistics.

Samples Chosen from Part of a Group. Correlation coefficients
from parts of a group are usually smaller than those from the whole
group. The reason for this is that in a small homogeneous group,
sharp contrasts in ability are usually lacking. If all persons in a group
make the same or nearly the same scores on a test, the correlation
with any other test will be nearly zero. Thus it happens that a
correlation of .50 among college students or among feeble-minded
persons may represent about the same degree of coincidence as a
correlation of .75 in an unselected population. Illus. 141 shows the result
of selecting a small part of a larger group for correlation purposes.
Comparable results are difficult to secure when various
unrepresentative samples of population are studied. A correction for the
limited range of ability in restricted samples is given by Kelley (1924,
p. 223).
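
The shrinkage of a correlation in a selected group can be shown by simulation. The figures are artificial and assumed only for illustration: normal scores with a correlation of .75 in the whole group, and a selected group consisting of those more than one SD above the mean on the first test.

```python
import math
import random

def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)          # fixed seed so the run is repeatable
true_r = 0.75
pairs = []
for _ in range(6000):
    x = rng.gauss(0, 1)
    y = true_r * x + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
    pairs.append((x, y))

# Keep only the upper portion of the group on the first test.
selected = [(x, y) for x, y in pairs if x > 1.0]

r_full = pearson_r(*zip(*pairs))
r_selected = pearson_r(*zip(*selected))
```

The coefficient in the selected group is far below that of the whole group, although the underlying relationship between the two tests is the same for every person.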

Random Errors. The correlation between two different measures
will generally be too low because test scores or estimates are only
approximations of true scores. Random errors of measurement in one
test are by definition not coincident with those in another test, and
therefore they reduce coefficients of correlation. When the
self-correlations of the two tests are known, there are several ways of
correcting for this reduction, which is called attenuation. The
corrections are small if the two tests have high self-correlations.
Conversely, when there are large chance variations in each of two tests,
their measured relationships will show a large increase if corrected
for attenuation.
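
One of the several corrections mentioned is the classical formula, usually credited to Spearman, which divides the observed correlation by the square root of the product of the two self-correlations. The passage does not say which correction the author has in mind, so the choice of this formula is an assumption, and the coefficients below are hypothetical.

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Observed r divided by sqrt(r_xx * r_yy), the two self-correlations."""
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical coefficients, for illustration only.
reliable = correct_for_attenuation(0.50, 0.95, 0.95)     # high self-correlations
unreliable = correct_for_attenuation(0.50, 0.60, 0.60)   # large chance variation
mixed = correct_for_attenuation(0.50, 0.81, 0.64)
```

With self-correlations of .95 the observed .50 rises only to about .53, while with self-correlations of .60 it rises to about .83.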

Sampling of More Than One Process in a Test. Many tests allow
the subjects to succeed by the use of different processes. Thus a
high score in the Minnesota Spatial Relations Test may be due to
rapid manipulation, or to methodical planning, or to specific
memory of a plan of work. This situation is usually true of all puzzles and
form boards, and of many verbal tests. In general, the more able
persons make their scores through good planning, the less able score
through rapid manipulation, and either group may use memory.
The chief factor discriminating between individuals in any test will
vary according to the ability of the persons taking the test. It is
probable that such variations in processes lead to lower correlations
between repetitions of a test than would be found if the same
processes were measured in all persons in a group. A remedy for this
situation is to devise tests in which success can be had by only one
particular combination of processes. The construction of tests of this
sort is a difficult and challenging problem.

ILLUS. 141. SCATTER DIAGRAM SHOWING THE EFFECT OF SAMPLING
NARROW PORTIONS OF A NORMAL GROUP

Use of Raw Scores Instead of Ranks. It sometimes appears that
the correlation between raw-score patterns on two variables is low
when the relationship between ranks as shown in an individual's
profile is high. This is found, for instance, in comparing interests
with vocabulary. We cannot ascertain by correlating scores whether
a person's field of greatest interest corresponds to his field of greatest
information, since a low score might in the case of a poorly informed
person represent his field of greatest information. This problem can
be solved by finding the correlation between the ranks of the test
scores and the ranks of interest scores in an individual's profile. The
tests may be assigned ranks by first plotting sigma scores on an
individual profile, then noting their order. An illustration of the
differences between correlating raw scores and correlating ranks in
individual profiles was reported by Fryer (1931). The correlations
between scores in academic tests and preferences for academic
subjects were nearly zero. When the correlations between these same
variables were based on ranks in individual profiles, they were much
higher, with a median of .60. This was taken to indicate that a
person's highest preference is likely to be his field of highest academic
achievement.
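
Correlating ranks within one person's profile may be sketched as follows. The sketch uses the familiar rank-difference formula and assumes no tied scores; the profile is hypothetical and describes a poorly informed person whose sigma scores are all below average.

```python
def ranks(values):
    """Rank 1 goes to the highest value; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def rank_correlation(a, b):
    """Rank-difference coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d_squared = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))

# Hypothetical profile in five fields (sigma scores), for illustration only.
information = [-1.2, -0.4, -1.8, -0.9, -1.5]
interest = [0.3, 1.1, -0.6, 0.5, -0.2]

rho = rank_correlation(information, interest)
```

Although every information score is low, the order of the fields is the same in both lists, so the rank correlation within the profile is perfect.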

Errors Which Raise Correlations Spuriously 

There are three main types of errors which make correlations too
high.

Sampling of Extremes. It is possible to choose persons from a
group so that relationships appear higher than they would if the
total population were included. This would happen if one omitted
a number of cases from the middle of the group (Illus. 141), as the
greatest variations between scores on two tests are usually found in
the middle two thirds of the cases. Although such selection is not
common, it occurs occasionally in samples from special clinics or
employment situations.

Sampling a Third Variable. If one compares the scores of two
tests for a group composed of children who range from six to twelve
years of age, the correlations will be spuriously high, because of the
high correlation with age which nearly all skills have during periods
of rapid growth. Younger children make lower scores on all tests,
because of less maturity.

A similar error is introduced when IQ's from two tests of children
of various chronological ages are correlated. IQ's are secured by
dividing the MA by the CA. The division of MA's by CA's results in
artificially raising the correlation, for the CA's used for each child
are the same in both tests.

There are two methods of avoiding these errors. The common 
procedure is to correlate only the scores within separate age groups. 
The other method is to calculate what the correlation would be if 
the effects of age were removed. This can be done by the
partial-correlation technique shown in any good statistical manual. It is
inadvisable to use partial correlations if the spurious factor can be
removed directly.
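
The partial-correlation technique may be sketched with the standard first-order formula. The coefficients are hypothetical: two tests that correlate .70 with each other and .60 each with age.

```python
import math

def partial_correlation(r_12, r_13, r_23):
    """Correlation of variables 1 and 2 with variable 3 held constant."""
    numerator = r_12 - r_13 * r_23
    denominator = math.sqrt((1.0 - r_13 ** 2) * (1.0 - r_23 ** 2))
    return numerator / denominator

# Hypothetical coefficients; variable 3 stands for age.
r_tests_age_removed = partial_correlation(0.70, 0.60, 0.60)

# If the two tests were related only through age, nothing would remain.
r_only_age = partial_correlation(0.36, 0.60, 0.60)
```

With age held constant the correlation between the tests falls from .70 to about .53, and a correlation produced entirely by age falls to zero.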

A similar situation is often found in interpreting a correlation 
between any two tests. Thus, speed-of-reading tests and arithmetic 
tests both involve cooperation with the examiner. If poor coopera- 
tion and poor ability go together, as is often the case, the correlations 
found between scores on the two tests will be higher than correla- 
tions between pure measures of ability would be. This problem of 
analyzing the factors which raise or lower correlations is a complex 
and persistent one. It is discussed in greater detail in Chapter XIV. 

Correlating a Part with the Whole. In many situations one wishes 
to find out whether a part of a test is consistent with the whole test. 
The correlations between single items and total scores are likely 
to be spuriously high if the item makes up a considerable part of the 
total. Corrections are not large enough to be considered essential if 
the item to be evaluated is one of ten or more items each of which 
has equal weight in the total. 

HOW LARGE MUST CORRELATIONS BE 
FOR THE PURPOSE OF PREDICTIONS? 

A practical question is often asked: is this correlation high enough
to be useful for predicting individual successes or failures? From the
discussion above it is clear that no definite limits can be set, and that
a correlation must be interpreted in its total situation. One must
always take into account the factors which have just been discussed
because they may introduce errors into the correlation coefficients.
If errors have been eliminated as far as possible, then there are two
ways of evaluating the significance of a correlation. One shows the
standard error of correlation, the other gives an indication of how
much the standard error of estimate is reduced by an increase in the
size of the correlation.

Probable Error of a Correlation Coefficient 

The standard error of a correlation ($\sigma_r$) is used to indicate
the extent of errors in a random sampling. It shows how much the
coefficient will probably vary on a chance basis, that is, if the same or
similar groups of persons were measured in the same way many times.
The $\sigma_r$ shows the theoretical range of the middle 68 per cent of
r's obtained by random sampling. If an r is three times its own
$\sigma_r$, it is highly improbable that the r's true value will be as
low as zero, since this would occur only about once in a thousand
trials. Both an increase in the total number of cases in a group and an
increase in the size of the correlation will result in a decrease in the
size of the $\sigma_r$. The calculation of the $\sigma_r$ is shown in
Illus. 136 using a formula, however, which is not recommended for all
correlations. For a more complete understanding one must consult an
advanced text.
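
The formula of Illus. 136 is not reproduced in this passage. A minimal sketch follows, assuming the classical large-sample expression in which the standard error of r equals one minus r squared, divided by the square root of the number of cases; the sample sizes and coefficients are hypothetical.

```python
import math

def standard_error_of_r(r, n):
    """Classical large-sample formula (1 - r^2) / sqrt(N); assumed, see lead-in."""
    return (1.0 - r * r) / math.sqrt(n)

def times_own_standard_error(r, n):
    """How many of its own standard errors an r lies above zero."""
    return r / standard_error_of_r(r, n)

se_mid = standard_error_of_r(0.50, 100)
se_high = standard_error_of_r(0.90, 100)
se_more_cases = standard_error_of_r(0.50, 400)
ratio = times_own_standard_error(0.50, 100)
```

Under this formula both a larger group and a larger coefficient give a smaller standard error, as the text states.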

The $\sigma_r$ tells the probability that an r will vary. Thus, if two
coefficients, one of .50 and the other of .90, have the same $\sigma_r$,
they are equally variable measures. This is a valuable bit of
information since it tells how much the correlation coefficient may be
expected to vary upon retrials. It does not, however, tell how much
better the prediction for individual cases is when the r is .90 than
when it is .50. The inaccuracy of a prediction is indicated by the
coefficient of alienation.

Coefficient of Alienation (k) 

It will be recalled that the prediction of the most probable score
in one variable from a score in another was made by means of the
regression equation. The variability of the most probable score was
then evaluated by means of the standard error of estimate. Since the
standard error of estimate was calculated from the correlation
coefficient, it is possible to show the relationship between these two
indices. If the correlation between variables X and Y is zero, then the
standard error of estimate ($\sigma_{Y \cdot X}$) is the same as the
standard deviation for the whole Y distribution. In this case the errors
in predicting scores on one variable (Y) from those on another variable
(X) are maximum. The only prediction that can be made is that one's Y
score will be somewhere among all the Y scores, and that one's most
probable Y score will be the mean score. If the correlation is larger
than zero, the standard error of estimate will be smaller than the
standard deviation of Y. The proportion to which the standard deviation
of Y is reduced is called the coefficient of alienation. This
coefficient is equal to $\sqrt{1 - r^2}$, and its magnitude for various
correlation coefficients is shown in Illus. 142. Reading from this
table, the coefficient of alienation for the correlation coefficient of
.451 in Illus. 142 is approximately .898. In other words, the standard
error of estimating a child's height is .898 of the standard deviation
of the entire group of children.

It also appears from Illus. 142 that the correlation must be .90 to
reduce standard error of estimate to 45 per cent of the standard
deviation of the total distribution. The correlation must be .995 to
reduce it to 10 per cent. This means that it is wise to be cautious in
placing faith in individual predictions made when the correlations are
less than .90. Since most of the correlations reported between raw
scores and various criteria of success are in the neighborhood of .50,
predictions for individual success on the basis of a single comparison
are usually too far from accuracy to be useful. This fact does not
discourage the careful worker, but makes him realize the need for
much more precise measures.

ILLUS. 142. COEFFICIENT OF ALIENATION k FOR VALUES OF r FROM .00 TO 1.00
(Garrett, 1937. By permission of Longmans, Green and Co.)
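
The coefficient of alienation is easily computed directly. The sketch below tabulates it for a few correlations mentioned in the text; values computed in this way may differ in the second or third decimal from those read off a printed table.

```python
import math

def coefficient_of_alienation(r):
    """k = sqrt(1 - r^2): the standard error of estimate as a fraction of the SD."""
    return math.sqrt(1.0 - r * r)

# Correlations taken from the discussion above.
k_values = {r: round(coefficient_of_alienation(r), 3)
            for r in (0.0, 0.451, 0.50, 0.90, 0.995)}
```

A correlation of .50 leaves the standard error of estimate at about 87 per cent of the standard deviation, which is why individual predictions from correlations of that size are so uncertain.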


STUDY GUIDE QUESTIONS 

1. Define concisely: scattergram, bar chart, regression line, correlation
coefficient.

2. When should bar charts be used to show relationships? 

3. What is the logic of the correlation by Pearson? 

4. When should the biserial r be used? 

5. What are the advantages of the Kuder-Richardson Reliability Formula?

6. Define standard error of a score, standard error of a mean, standard
error of a difference, critical ratio, and level of significance.

7. What kinds of situations make a correlation spuriously high? Spuriously 
low? 

8. Of what use are standard errors of coefficients and coefficients of
alienation?

FACTORIAL ANALYSES




ASSUMPTIONS 

The previous chapters have described the main categories of behavior
and how particular samples of skills and information can be given
numerical representation. Another persistent problem in the study
of behavior is approached in this chapter, namely, how are abilities
of persons related? Answers to this question are fairly numerous and
usually hypothetical. Persons are so complex that patterns of
personality have not as yet been well established.

There are, however, two methods widely used for the analysis of
mental organization: one, biographical; the other, statistical. The
biographical method records a sequence of events which are related
in time and space. It has the advantage of showing apparent causal
relationships and, if carefully done, it furnishes a valuable analysis
of trends. However, it usually fails to provide a basis for the most
adequate comparison between persons, and to give numerical
analyses of what might be considered elements or forces in the patterns
described. The second, or statistical method, shows the coincidence
between various measures of persons. It may be used to study either
simultaneous relationships or temporal sequences. Eventually, logical
syntheses of such quantitative results yield a body of natural

The psychologist's search for elements of behavior is much like the
chemist's search for earth elements of fifty years ago. The chemists
secured samples of organic and inorganic compounds and subjected
them to rigid tests which showed their reactions to various forces,
such as gravity, electricity, pressure, and heat. On the basis of these
tests the samples were classified into groups of compounds which
behaved similarly. After years of research it was found that
compounds which behave in similar fashion usually have common
elements which can be isolated. An element is described as a unique
substance which cannot be divided into other substances. Although the
isolation of an element is usually an important step in its recognition,
isolation is not absolutely necessary for recognition. An element may
be described and measured accurately (a) if it is present in different
amounts in a number of the samples which are available, (b) if it
reacts to the tests differently from other elements present, and (c)
if its reactions to the tests are the same in combination and in
isolation. This sort of analysis depends upon having a large number of
samples which are qualitatively and quantitatively different.

Similarly, persons may be taken as the samples to be analyzed, 
subjected to various tests, and classified according to similarities of 
responses. For the analysis of factors in test scores, two assumptions 
are usually made. 

1. All the scores on a given test represent the same factors. This 
assumption is not representative of a large number of tests when the 
same scores represent different ability patterns, but it is
representative of carefully controlled situations. Unless one is sure that a
particular score represents the same pattern in all persons at all times,
the comparisons are inconclusive. There is no absolute criterion to 
which one can appeal. Here, as in all other sciences, one must rely 
upon the agreement of observers who are considered to be competent. 

2. A high correlation between two tests indicates similar processes
in both tests. This assumption is also likely to be false in many test
situations. A high correlation simply indicates that all persons have
the same relative place in each test distribution. It does not show
what factors have caused them to take these positions in the group.
Only in carefully controlled situations may one make this
assumption.

From these two considerations it is clear that a high or a low
correlation is most significant when the test conditions are rigidly
controlled. Few, if any, studies of persons have appeared that allow an
uncontroversial interpretation of the factors involved in a
correlation, but so much work of a preliminary nature has been
accomplished that no discussion of mental measurement can ignore it.
This chapter gives a brief introduction to mathematical methods
that have been proposed for analyzing test relationships, a field of
study in which the methods are still rapidly developing.

FACTORS

The last fifty years have witnessed a marked development in the
analysis of psychological data by means of techniques which assume
that the relations found between tests are due to the existence of
common elements or factors. A factor is defined as some element in
the test which can be distinguished from other elements. Factors may
be distinguished psychologically by observation, or mathematically
by showing whether persons who are high on one test are high on
another. If scores on two tests have a zero correlation with each other,
they do not exhibit any factors in common. One of the main
problems is the reconciliation of mathematical and psychological factors.

SPEARMAN’S CONTRIBUTION 

Spearman (1904) searched for the simplest explanation of the
relations seen in correlation matrices. A correlation matrix is a table
which shows the correlation of each test in the battery with each of
the other tests, as in Illus. 143. He reported that the relation shown
in Illus. 143 could be ascribed to one general factor which was present
in all five tests in various amounts. In this case he showed that every
individual score of every ability or attitude which is represented in
the matrix could be divided theoretically into two independent kinds
of factors: one the general factor, called g, and other specific factors,
called s, which vary from test to test. A number of other explanations
of the relations shown in Illus. 143 are, of course, possible, but
Spearman preferred this one, since it was the simplest explanation of the
intercorrelations, and it corresponded to his psychological analysis
of the tests. For Spearman's arguments and mathematical proof one
should consult his Psychology through the Ages (1938), or Guilford
(1936).
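
What it means for a matrix to be ascribable to one general factor can be illustrated with a small sketch. It assumes the usual one-factor model, in which the correlation between any two tests is the product of their loadings on g, and it checks Spearman's tetrad-difference criterion, which is not spelled out in this passage. The loadings are hypothetical and are not those of Illus. 143.

```python
from itertools import combinations

def one_factor_matrix(loadings):
    """Correlation matrix in which r_ij = g_i * g_j for every pair of tests."""
    n = len(loadings)
    return [[1.0 if i == j else loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]

def max_tetrad_difference(r):
    """Largest absolute tetrad difference over all sets of four tests."""
    worst = 0.0
    for a, b, c, d in combinations(range(len(r)), 4):
        p1 = r[a][b] * r[c][d]
        p2 = r[a][c] * r[b][d]
        p3 = r[a][d] * r[b][c]
        worst = max(worst, abs(p1 - p2), abs(p1 - p3), abs(p2 - p3))
    return worst

# Hypothetical g loadings for five tests.
matrix = one_factor_matrix([0.9, 0.8, 0.7, 0.6, 0.5])
tetrad_one_factor = max_tetrad_difference(matrix)

# Raising one correlation above what g alone predicts breaks the pattern.
disturbed = [row[:] for row in matrix]
disturbed[0][1] = disturbed[1][0] = 0.95
tetrad_disturbed = max_tetrad_difference(disturbed)
```

When a single factor accounts for the matrix, every tetrad difference vanishes; a pair of tests that correlate more highly than g alone allows produces non-zero differences, which is the kind of evidence that leads to group factors.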

Spearman found that a g factor would not account for
intercorrelations in all matrices, but only in those which were limited to
tests of abstract comparison, as in Illus. 143. When tests were
included which depended principally on motor coordination or sensory
discrimination, one general factor could not completely account for
all the correlations of the matrix. It was then necessary to include
group factors to account for the correlations. From these studies he
concluded that mental organization can be most adequately described
as due to three kinds of factors: a general factor, found in nearly all
mental tests; group factors, found only in groups of similar tests;
and specific factors, found only in one test.

The General Factor (g) 

From the analyses of many batteries of tests, Spearman and his
students have tentatively named some of these factors. The general,
or g, factor is found in large amounts in tests of abstraction, whether
verbal or nonverbal, and in smaller amounts in tests of perceptual
discrimination. It was not found in tests of rote memory or rote
learning. Speed and accuracy are interchangeable measures of g
when complexity is held constant. The g factor is therefore described
by Spearman as a special sort of energy which can be applied to
making comparisons or drawing inferences. Some have thought this a
good definition of intelligence, but Spearman has pointed out that
most intelligence tests have large rote memory factors, hence they
are not good measures of g, although some amount of g is usually
present.

Group Factors 

The group factors which have been isolated by Spearman seem to
fall into two general classes: those due to independent abilities and
those due to variations in g. The ability factors are called mechanical,
arithmetical, musical, logical, and psychological. The mechanical
and arithmetical factors are to be found in spatial and number tests.
The logical factor is independent of the g factor, for from Spearman's
analysis it is a technique of logical comparison which is different
from a general mental energy. The psychological factors, of which
there were four, are described as bases of judgments used in making
decisions concerning (1) concrete objects, (2) abstract ideas, (3) moral
concepts, and (4) interest and pleasures. It was found that some
persons excelled in one kind of judgment and some in another, so that
correlations within these fields were higher than would be expected
from the presence of a g factor alone. A group factor of goal
consistency, also called will, is indicated by the fact that higher
correlations than can be explained by g alone are found in tests which
demand a persistence of motive. Spearman (1927) did not find evidence
of group factors for speed, attention span, sensory discrimination,
motor coordination, or language. More complete research by
Holzinger (1935), however, recognizes group factors for verbal ability,
motor and mental speed, attention, and imagination.

The group factors attributable to variations in the g factor are
called perseveration and fatigue. These are typically character traits.
Perseveration is indicated by ease or difficulty in changing quickly
from one activity to another. Fatigue is indicated when a person's
scores in one test decrease after a fatiguing experience more rapidly
than his scores in a different test. Groups of persons having the same
fatigue patterns will show higher correlations after fatigue than other
groups will.

Much difficulty has been experienced in distinguishing among the
general, the group, and the specific factors. For instance, if the
variety of tests in a battery is small, a g (general factor) may be found
common to all of them. If these same tests are included among a
larger variety of tests, this g factor might become a group factor.
Likewise, in a small battery of tests it is quite likely that a factor
will appear in one test only, and hence be called a specific factor. But
when a large number and variety of tests are used, these specific
factors will often become group factors. It seems, therefore, that the
classification into general, group, and specific factors is more
dependent upon the particular battery of tests used than upon the
mental organization of persons tested. There is reason to believe that
Spearman's g factor and many s factors will become group factors when
a large enough battery of tests is used.

Other factorial analyses have been described by Hotelling (1933),
Kelley (1928), Burt (1938), and several others using somewhat
different assumptions and securing different results. The field is still
a controversial one. All of these workers have devised methods for
determining (a) which factors account for the relationships of a given
correlation matrix, (b) how much of each factor is used in each test,
and (c) how much is shown in an individual's score on a test.

THURSTONE'S METHOD 

The general method of Thurstone (1946) will be discussed because
it has been more widely used than the others. The mathematical
explanation of a correlation analysis should be secured from
Thurstone's work, but the main assumptions and procedures will be
briefly presented here.

Analyses of a Score

Since test scores are basic materials, a working hypothesis is needed
for their analysis. Thurstone, as well as almost all other statistical
analysts, defines a test as a situation in which a person's position is
determined by a number of forces.¹ The simplest possible case would
be that in which only a single force is operating. This is shown in
Illus. 144A, where all persons in the group may be represented by
points on a straight line. The direction of the force is indicated by
the arrow and the amount of the force by the distance from zero to
the point representing the scores of persons H, I, and J. This is the
situation diagrammed in any frequency curve, ogive, or histogram.
The score of H = 5x, of I = 10x, and of J = 15x. Any person's score
may be represented as S = ax, where a is a certain amount of x.

A more complex score would be that which resulted from the
operation of two independent forces. This is shown in Illus. 144B,
where the position of a person's score can be accurately described as
the resultant of forces $x_1$ and $x_2$. In this case H's score =
$5x_1 + 5x_2$, I's score = $10x_1 + 7x_2$, and any score would be
$S = a_1x_1 + a_2x_2$.

ILLUS. 144. GRAPHIC REPRESENTATION OF INDEPENDENT VARIABLES

A still more complex test might involve three independent forces
as shown in Illus. 144C. Here the scores of H and I on Test A must be
represented by

$$H = 5x_1 + 4x_2 + 7x_3$$
$$I = 5x_1 + 7x_2 + 1x_3$$

and any score by:

$$S = a_1x_1 + a_2x_2 + a_3x_3$$

¹ The use of the word force in this chapter should not be interpreted literally
as a mechanical entity in all of its applications. In some instances it simply refers
to a parameter in a system of measures.





From these figures it is apparent that the plus signs do not indicate
simple addition of similar elements, but a combination of particular
amounts of forces working independently of each other. Such forces
are called vectors. Their independence is shown by their different
directions, since force $x_1$ may vary without introducing any change
in $x_2$. When two forces operate at right angles to each other, they
are called orthogonal. In actual space one can distinguish simultaneously
only three directions at right angles to each other. In a mental test,
however, it is possible to imagine more than three factors which
would be unrelated. Hence the score of a person on any test may be
indicated by

$$\text{Score} = S = a_1x_1 + a_2x_2 + a_3x_3 + \cdots + a_nx_n$$
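
A small sketch may make the vector interpretation concrete. It assumes, purely for illustration, that the three forces of the three-force example are unit vectors at right angles to one another; the helper names `dot` and `position` are not from the text.

```python
def dot(u, v):
    """Dot product; it is zero when two directions are at right angles."""
    return sum(a * b for a, b in zip(u, v))


def position(amounts, axes):
    """Combine amounts a_i of the forces along axes x_i into a single point."""
    dimensions = len(axes[0])
    return [
        sum(a * axis[k] for a, axis in zip(amounts, axes))
        for k in range(dimensions)
    ]


# Three orthogonal unit forces x1, x2, x3 (an assumption made for the sketch).
axes = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

person_h = position([5, 4, 7], axes)  # H = 5x1 + 4x2 + 7x3
person_i = position([5, 7, 1], axes)  # I = 5x1 + 7x2 + 1x3

print(person_h, person_i)
print(dot(axes[0], axes[1]))  # 0: x1 can vary without changing x2
```

The same functions work unchanged with more than three axes, which corresponds to the remark that a mental test may involve more unrelated factors than physical space has directions.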

Analyses of a Group of Scores on a Test 

Since any score on a particular test may be considered as the
resultant of the component forces, a group of scores may likewise be
thought of as resulting from the same forces. A group of scores is
best described by their variation from a central tendency. The standard
deviation, which is commonly used as a measure of dispersion, is
a convenient index for the analysis of variation of a group of scores.
If only one force is effective in a test, then the standard deviation is
a direct measure of the variation of the force among individuals. If,
however, we assume that two, and only two, forces are operative, then
the standard deviation is a resultant of both forces. If we further
assume that the two forces are independent of each other, they can be
diagrammed at right angles to each other as in Illus. 145. There the
total scores on Test A are represented by the diagonal line, and the
amounts of the two component forces, x and y, by the horizontal
and vertical lines. The standard deviations are shown by the points
under the distribution curves. The standard deviations of the forces,
x and y, bear a definite relationship to the standard deviation of Test
A. According to the Pythagorean theorem, the square of the standard
deviation of Test A ($\sigma_A^2$) is always the sum of the squares of the
standard deviations of forces x and y: $\sigma_A^2 = \sigma_x^2 + \sigma_y^2$.

Because the square of the sigma is convenient as a measure in these
analyses, it is called the variance of a group of scores. The variance
of a test which has only two component factors can be completely
described from the variances of those factors.

When more than two factors are present, the variance of the total
score still equals the sum of the variances of all the component factors.

ILLUS. 145. THE DISPERSION OF TOTAL SCORES AND OF
COMPONENT FORCES
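
The Pythagorean relation among the standard deviations can be checked numerically. The sketch below uses made-up component scores for four persons, chosen so that the two forces are exactly uncorrelated; it is an illustration of the additivity of variances under that assumption, not a reproduction of Illus. 145.

```python
import math


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Population variance (the square of the standard deviation)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def covariance(u, v):
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


# Hypothetical amounts of two independent forces for four persons.
x = [3.0, -3.0, 3.0, -3.0]
y = [4.0, 4.0, -4.0, -4.0]
total = [a + b for a, b in zip(x, y)]

print(covariance(x, y))            # 0.0: the forces are uncorrelated
print(variance(x), variance(y))    # 9.0 16.0
print(variance(total))             # 25.0 = 9.0 + 16.0
print(math.sqrt(variance(total)))  # 5.0, the hypotenuse of a 3-4-5 triangle
```

If the two forces were correlated, a covariance term would have to be added and the simple right-angle diagram would no longer apply.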



Thurstone has shown that the total variance of any test may be
analyzed into three kinds of factors, called common or group factors,
specific factors, and chance factors. Common factors are defined as
those which are found in at least two tests in the battery. Specific
factors are those found only in one test, and chance factors are those
due to random variations. To simplify calculations, the variance of
total scores is made equal to one by using standard scores instead of
raw scores throughout. The total variance of a test then can be
indicated by

$$\sigma_A^2 = \sigma_{Ax}^2 + \sigma_{Ay}^2 + \sigma_{Az}^2 + \cdots + \sigma_{An}^2 + \sigma_{As}^2 + \sigma_{Ac}^2 = 1$$

where

A represents total scores on Test A.

$\sigma_{Ax}$ is the amount of variation in total scores due to factor x.

x, y, z, and n are factors common to A and also at least one other
test,

s is a specific factor, and c is a chance factor.

$\sigma_{Ax}^2$ is the per cent of $\sigma_A^2$ which is due to x, since all the
factors add to 1.00. It is also defined as the square of the loading of
factor x in Test A.
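
As a hedged illustration of this partition, the sketch below splits the unit variance of one test into common, specific, and chance parts. The loadings and the reliability are hypothetical, and the sketch assumes that the chance variance may be estimated as one minus a retest reliability, with the specific variance obtained by subtraction; the function name is introduced only for this example.

```python
def partition_variance(common_loadings, reliability):
    """Split a total variance of 1 into common, specific, and chance parts.

    Assumes standard scores, squared loadings as shares of variance, and
    chance variance estimated as 1 - reliability (an assumption of this sketch).
    """
    common = sum(loading ** 2 for loading in common_loadings)
    chance = 1.0 - reliability
    specific = reliability - common
    if specific < 0:
        raise ValueError("common variance exceeds the reliable variance")
    return {"common": common, "specific": specific, "chance": chance}


# Hypothetical test with loadings .6 and .5 on two common factors
# and a retest reliability of .90.
parts = partition_variance([0.6, 0.5], 0.90)
for name, share in parts.items():
    print(f"{name:9s}{share:.2f}")
print(f"total    {sum(parts.values()):.2f}")
```

Here 61 per cent of the variance is common, 29 per cent specific, and 10 per cent chance, and the three shares add to 1.00 as the equation requires.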

Common Factors 

The calculation of the common factors in a test is possible from
the intercorrelations between tests, because the correlation of any
two tests is equal to the covariance of their common factors. Hence,
a product-moment correlation can be analyzed as

$$r_{AB} = \sigma_{Ax}\sigma_{Bx} + \sigma_{Ay}\sigma_{By} + \sigma_{Az}\sigma_{Bz} + \cdots + \sigma_{An}\sigma_{Bn}$$

where

A and B are total test scores.

$\sigma_{Ax}$, $\sigma_{Ay}$, $\sigma_{Az}$, and $\sigma_{An}$ are the factor loadings of
Test A in factors x, y, z, and n. The factor loading in x shows the
variation in A scores which is due to variations in the x factor. It is
also a correlation of Test A with factor x.

$\sigma_{Bx}$, $\sigma_{By}$, $\sigma_{Bz}$, and $\sigma_{Bn}$ are the factor loadings of
Test B in factors x, y, z, and n.
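
The formula can be tried on a small example. The loadings below are hypothetical and the function name is an assumption of this sketch; the point is only that factors on which either test has a zero loading contribute nothing to the correlation.

```python
def reproduced_correlation(loadings_a, loadings_b):
    """Sum of the products of two tests' loadings on the same common factors."""
    return sum(a * b for a, b in zip(loadings_a, loadings_b))


# Hypothetical loadings of Tests A and B on common factors x, y, and z.
test_a = [0.6, 0.5, 0.0]
test_b = [0.7, 0.0, 0.4]

r_ab = reproduced_correlation(test_a, test_b)
print(f"r_AB = {r_ab:.2f}")  # only factor x is shared: .6 x .7 = .42
```

Tests A and B correlate only through factor x, because A has no loading on z and B has none on y.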

In a practical test situation one does not usually know with any
mathematical certainty how many factors are common to two or more
tests. Test results, however, may be made to furnish a correlation
matrix which shows the relations between tests. Thurstone (1946)
described a unique solution which determines the smallest number
of factors needed to account for the test correlations. The matrix is
treated as a determinant and solved by a method which assumes that
all factor loadings should be positive within the tolerance of
sampling errors, and that the number of zero loadings should be made as
great as possible to give unambiguous results.

In Thurstone's centroid method a central or average value based
on all correlations in the matrix is found. From this center, factor
axes can be computed and drawn as in Illus. 146, showing the relations
of tests and clusters of tests. On a flat surface only two unrelated
axes can be drawn, but in statistical practice a fairly large number
of unrelated axes appear. These may be shown as radii of a sphere
or series of spheres. One of the principal problems of analysis is that
of determining a center. If new tests are added to the battery or new
subjects tested, the center may change.
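
The arithmetic of a first centroid factor can be sketched as follows. This is a minimal illustration under stated assumptions, not Thurstone's full procedure: the diagonal cells are assumed to hold communality estimates, all correlations are assumed positive so that no reflection of signs is needed, and the small matrix is hypothetical, built from a single factor with loadings .9, .8, and .7 so that the result can be checked by eye.

```python
import math


def first_centroid_loadings(r):
    """Loading of each test on the first centroid: column sum / sqrt(grand total)."""
    n = len(r)
    column_sums = [sum(r[i][j] for i in range(n)) for j in range(n)]
    grand_total = sum(column_sums)
    return [s / math.sqrt(grand_total) for s in column_sums]


def residual_matrix(r, loadings):
    """Correlations left over after the first factor has been removed."""
    n = len(r)
    return [
        [r[i][j] - loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical correlation matrix with communalities in the diagonal.
r = [
    [0.81, 0.72, 0.63],
    [0.72, 0.64, 0.56],
    [0.63, 0.56, 0.49],
]

loadings = first_centroid_loadings(r)
residuals = residual_matrix(r, loadings)
print([round(a, 3) for a in loadings])                     # [0.9, 0.8, 0.7]
print(max(abs(v) for row in residuals for v in row) < 1e-9)  # True
```

Because the loadings depend on sums over the whole matrix, adding or removing tests changes the sums, which is one way to see why the center may shift when the battery is altered.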

Another feature of Thurstone's method is the rotation of factor
axes to yield the smallest loadings. Thus, in Illus. 146 there are
two sets of axes, X and Y, and X' and Y'. Each test is first represented
by its loadings on the X and Y axes. All the loadings are high on
both axes because the tests lie at considerable distance from either
axis. In order to present a simple picture, Thurstone rotates the
axes to give the maximum number of zero or near-zero scores and
to avoid minus values. He has shown that this rotation not only
gives a simpler structure, but also yields factor loadings for a test
which tend to remain the same when the test battery is changed or
when new subjects are used.

ILLUS. 146. ROTATION OF AXES

A centroid graph showing rotation of axes to reduce negative weights and to
increase the number of near-zero weights. Each circle shows the position of a
test as computed from the X and Y axes. The tests in the lower right quadrant
have negative Y weights. With the same center a new set of X' and Y' axes can be
drawn or computed so that one cluster of tests has near-zero scores on the Y' axes
and the other on the X' axes, and neither cluster has significantly minus weights.
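
The effect of rotating the axes can be sketched for the two-factor case. The two test positions below are hypothetical, chosen so that one lies 30 degrees below the X axis and the other 60 degrees above it; the rotation shown is an ordinary rigid rotation of orthogonal axes, offered as an illustration rather than as Thurstone's graphic procedure.

```python
import math


def rotate(loadings, degrees):
    """Re-express (x, y) loadings on axes turned counterclockwise by `degrees`."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(x * c + y * s, -x * s + y * c) for x, y in loadings]


def communality(point):
    """Squared distance from the center; unchanged by rotation."""
    return point[0] ** 2 + point[1] ** 2


# Hypothetical tests: one 30 degrees below the X axis, one 60 degrees above it.
tests = [
    (0.7 * math.cos(math.radians(-30)), 0.7 * math.sin(math.radians(-30))),
    (0.7 * math.cos(math.radians(60)), 0.7 * math.sin(math.radians(60))),
]

rotated = rotate(tests, -30)  # turn both axes 30 degrees clockwise

for before, after in zip(tests, rotated):
    print(
        f"({before[0]:+.2f}, {before[1]:+.2f}) -> "
        f"({abs(after[0]):.2f}, {abs(after[1]):.2f})   "
        f"h2 = {communality(after):.2f}"
    )
```

Before rotation the first test has a negative Y weight and both tests load on both axes; afterward each test loads .70 on one axis and zero on the other, while the communality of each test is unchanged.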

This method gives a fairly complete solution of unique factors in 
a particular battery of tests, without, however, giving the factors 
names which represent particular forces or psychological processes. 
Names of processes are usually given to mathematical factors from 
an inspection of the tests which show the highest loadings. 

Chance and Specific Factors 

The chance variations (c) in test scores may be measured by a
correlation of two trials of the test, assuming that no systematic
changes have occurred between trials. The variance due to specific
factors (s) which are not found in any other test can be found by
subtracting all the other elements in the total variance from 1.

EXAMPLES OF FACTORIAL ANALYSES

Two examples are given here to show the results of applying a
factorial analysis to the scores of a battery of tests of a group. One
is Thurstone's analysis of primary mental traits; the other, Mosier's
analysis of personality traits. Other examples are given in this book.

Thurstone's Test of Primary Traits

Thurstone presented a group of college students with fifty-seven
carefully controlled tests requiring 15 hours of work. The tests
included nearly every existing variety of verbal and nonverbal
thinking, but did not include adjustments-to-persons or motor tests. The
scores of each test were correlated with the scores of each of the
others, and the resulting matrix yielded twelve factors, of which
nine were sufficiently large to be identified. The tests which had
large loadings, and hence were considered to be the best indicators
of factors, are shown in Illus. 147. Using these and similar tests, a
scale for superior adults has been constructed which discloses a
person's relative position in these primary abilities.

Illustration 148 shows the factor loadings for the tests just described
together with the percentage of the total variance, which is indicated
in the h² column, and the retest reliability. From this illustration it
appears that some tests have fairly heavy loadings in more than one
factor. Thus, Figure Classification shows a loading of .39 in factor
S, Spatial Visualization, .40 in factor I, Induction, and .40 in factor
D, Deduction. Such tests are not of as great analytical value as
Multiplication, which is well accounted for by factor N, Number
Facility. It is also clear from the h² column that some of the items are
not completely accounted for by these factor loadings. Since the tests
are shown to have high reliabilities, it is probable that tests with
small h² values, such as Figure Classification, demand skills or
adjustments not well represented by any of these factors. This
condition can be remedied only by more extensive study.

Mosier’s Analysis of Personality Traits 

Mosier (1937), who followed a procedure reported by Guilford
(1936), selected from several lists of personality traits the items which
seemed most diagnostic for clinical use. These items, which are
shown in Illus. 149, were administered twice, with one week
intervening, to five hundred male students at the University of Florida.
The scores for the forty-two items and also for the American Council
on Education Psychological Examination were intercorrelated. From

ILLUS. 147. TESTS OF PRIMARY ABILITIES
1. Spatial Relations (S)

Cubes (adapted from Brigham, 1932): factor loading, .626

The drawings in this test represent cubes. There is a different design on each face of
the cube. A cube has six faces.

Notice that both of the drawings below can represent the same cube. Be sure
you see that the first and second drawings represent the same cube turned into two
different positions. Since both drawings can represent the same cube, a plus sign
(+) has been placed in the blank square at the right.

[Drawings of two cubes]

Notice that the two drawings below represent two different cubes and that a minus
sign (-) has been placed in the blank square at the right. Be sure that you see that
it would be impossible to turn the cube shown in the first drawing so that it would
look EXACTLY like the cube shown in the second drawing. Unless you see this
clearly, you cannot solve the test items. There is a different design on each face of
the cube.

[Drawings of two cubes]

2. Perceptual Speed, Visual (P)

Word Grouping: factor loading, .573

In the line below, notice that the four words, dog, lion, cat, and giraffe, can be grouped
together because they are all names of animals. The word chair does not belong with
the others because it is not the name of an animal. Since chair is the second word,
2 is written in the blank at the right.

1 - dog   2 - chair   3 - cat   4 - lion   5 - giraffe      2

Similarly, four of the words in the line below can be grouped together because they are
alike in some way, while one of the words does not belong because it is different.
Write the number of that word in the blank at the right.

1 - carrot   2 - radish   3 - beet   4 - book   5 - turnip      ____

3. Number Calculations (N)

Addition: factor loading, .755

Addition is a simple test of ordinary addition of seven two-digit numbers.

17 

61 

93 

21 

U 

17 

65 

Multiplication: factor loading, .812

Multiplication is a simple test involving the multiplication of six-digit numbers by a
single-digit number.

7245086
4

4. Verbal Relations (V)

Inventive Opposites: factor loading [illegible]

This is a test of ability to think of words. Think of two different words opposite
in meaning to the word narrow below. One word should begin with b. The other
should begin with w. The words are broad and wide. These words have been written
in the blanks.

narrow   b____   w____

Now think of two words opposite in meaning to the word large. The first should
begin with l, the second with s.

large   l____   s____

The words are little and small. Write little in the first blank. Write small in the
second.

strong   t____   w____

wrong    r____   c____

dark     b____   l____

5. Word Forms (W)

Anagrams: factor loading [illegible]

Make as many different words as you can, using only the letters in the word
G-E-N-E-R-A-T-I-O-N-S. You may use long or short words and may include the names of
persons, places, or foreign words. In any one word do not use a letter more times than
it appears in G-E-N-E-R-A-T-I-O-N-S.
Sample words have been written in the first few lines. Continue writing as many
words as you can, using only the letters given.

G-E-N-E-R-A-T-I-O-N-S

1. ART

2. ERA

3. SNORE

4.

Disarranged Words: factor loading, .512

Rearrange the letters on each of the following lines to spell the name of an animal.
In the first line the letters (ebar) can be arranged to spell bear, which is written in the
blank space. In the next line, the letters (odg) spell dog, which is written in the
blank space. In the same way the letters (atc) spell cat.

ANIMALS
ebar   bear

odg    dog

atc    cat

Rearrange the letters on each of the following lines to spell the name of a boy.
The first two names have already been written for you. Write the third.

BOYS' NAMES
lpau   Paul

rcla   Carl

honj   ____

Rearrange the letters on each of the following lines to spell the name of a bird.

BIRDS

uckd   ____

cowr   ____

wahk   ____

6. Immediate Memory Span (M)

Word-Number: factor loading, .529

Word-number was prepared as a test involving memorizing. The subject memorizes
a set of paired associates. Each stimulus word is to be associated with a response
number. In the recall the subject is given the stimulus word, and he is asked to write
the corresponding response number. The test is arranged with instructions and a
fore-exercise followed by a recall. A second fore-exercise, which is longer, is then given.
It is followed by a recall. The test proper with twenty words and associated numbers
is then presented. This is followed by the recall. The test really consists of three
sections with a presentation and recall in each section. This was done in order to
make sure that the subjects understood the nature of the task.

Number-Number: factor loading, .664

Number-number is a paired-associates test in the same form as the two previous tests
in which the stimulus consists of a two-digit number, and the response is another
two-digit number. This test is also given in two sections with two parts for each section.
Together with the instructions the subject is given five paired numbers to associate.
He is then asked to recall the five response numbers when the stimulus numbers are
presented. He is given the opportunity to write the numbers if he wants to learn them
in that manner. Then follow the memorizing of the twenty pairs of numbers and the
recall in which the twenty stimulus numbers are presented in random order. The
subject is asked to fill in the response numbers.

7. Induction (I)

Number Series: factor loading, .503

The numbers in each row of this test follow one another according to some rule. You
are to find the rule and fill in the blanks to fit the rule.
In the example below each number can be obtained from the one before it by the
rule add 2. The blanks have been filled in accordingly.

2   4   6   8   10   12   14

Find the rule in the series below and fill in the blanks. You may use addition,
subtraction, multiplication, division, or any combination of these.

10   8   11   ____   12   ____   13

The above series goes by alternate steps of subtracting 2 and adding 3. You should
have written 9 and 10 in the blanks.

Find the rule in each series below and write the numbers in the blanks accordingly.
There is a different rule for each line. Go right ahead. Do not wait for any signal.

19   18   17   ____   15   14

8    11   14   ____   20   ____

27   ____   23   23   19   IT

Figure Classification: factor loading, .405 (Also called Spearman's Form Analogies
Test)

In each line below, there is a rule by which the symbols in Group I differ from those in
Group II. There is a new rule for each line. Your problem is to discover the rule
in each line. Some sample problems are worked for you below.
In the first line below the rule is that the symbols in Group I are horizontal while
those in Group II are vertical. Each of the test symbols at the right belongs either to
Group I or to Group II. The test symbols that belong to Group I have been checked.

[Figure: Group I, Group II, Test Symbols]

The rule in the problem below is that the figures in Group I are closed while those in
Group II are open. Now check the test symbols that belong to Group I.

[Figure: Group I, Group II, Test Symbols]

You should have checked the third and fourth symbols. They are closed figures.

8. Reasoning (R)

Arithmetical Reasoning: factor loading, .583

Arithmetical Reasoning contains nineteen problems and is similar to current tests of
this type.

Mechanical Movements: factor loading, .414

In this test you will be shown pictures of mechanical movements. You will be asked
questions about them.

In each picture the part that makes the others move is called the driver. The solid
black circles represent axles which can turn but cannot move from where they are
shown.

Now answer the questions after each of the pictures below. Go right ahead.

[Picture of a mechanical movement]

1. If B starts moving in the direction shown, which way will A move, 1
or 2? ____

2. In which direction will A be moving when B has turned half way around from
where it is now? ____

Vocabulary: general factor loading, .545

Word Knowledge is the Thorndike Vocabulary Test.

9. Deduction (D)

False Premises: factor loading, .578

This is a test of your ability to tell the difference between good and bad reasoning.
You must judge only the reasoning in the following arguments because every statement
is false or even absurd.

The first argument below is good reasoning and is marked plus (+). The second
argument appears similar but is bad reasoning and is marked minus (-).

All haystacks are catfish. All catfish are typewriters.
Therefore all haystacks are typewriters.   +

All haystacks are typewriters. All catfish are typewriters.
Therefore all haystacks are catfish.   -

Reasoning: factor loading, .525

This is a test of your ability to tell the difference between good and bad reasoning.
The first argument below is good reasoning and is marked plus (+). The second
argument appears similar but is bad reasoning and is marked minus (-).

All sports are dangerous, and football is a sport.
Therefore, football is dangerous.   +

Some sports are dangerous, and football is a sport.
Therefore, football is dangerous.   -

Now mark the two arguments below in the same way.

All wealthy men pay taxes. Mr. White pays taxes.
Therefore Mr. White is wealthy.   ____

All wealthy men pay taxes. Mr. White is wealthy.
Therefore, Mr. White pays taxes.   ____

The first argument above is bad and should have been marked minus (-). The
second should have been marked plus (+).

(Arranged from Thurstone, 1938. By permission of the Editor, Psychometrika
Monographs.)

(Arranged from Thurstone, 1938, from Tables 3 and 4. By permission of the Editor, Psychometrika Monographs.)

ILLUS. 149. QUESTIONNAIRE ITEMS ON NEUROTIC BEHAVIOR

1. Do you get stage fright?
2. Do you have difficulty in starting conversation with a stranger?
3. Do you worry too long over humiliating experiences?
4. Do you often feel lonesome, even when you are with other people?
5. Do you consider yourself a rather nervous person?
6. Are your feelings easily hurt?
7. Do you keep in the background on social occasions?
8. Do ideas run through your head so that you cannot sleep?
9. Are you frequently burdened by a sense of remorse?
10. Do you worry over possible misfortune?
11. Do your feelings alternate between happiness and sadness without apparent
reason?
12. Are you troubled with shyness?
13. Do you day-dream frequently?
14. Have you ever had spells of dizziness?
15. Do you get discouraged easily?
16. Do your interests change quickly?
17. Are you easily moved to tears?
18. Does it bother you to have people watch you at work, even when you do it
well?
19. Can you stand criticism without feeling hurt?
20. Do you have difficulty making friends?
21. Are you troubled with the idea that people are watching you on the street?
22. Does your mind often wander badly so that you lose track of what you are
doing?
23. Have you ever been depressed because of low marks in school?
24. Are you touchy on various subjects?
25. Are you often in a state of excitement?
26. Do you frequently feel grouchy?
27. Do you feel self-conscious when you recite in class?
28. Do you often feel just miserable?
29. Does some particular useless thought keep coming into your mind to bother
you?
30. Do you hesitate to volunteer in a class recitation?
31. Are you frequently in low spirits?
32. Do you often experience periods of loneliness?
33. Do you often feel self-conscious in the presence of superiors?
34. Do you lack self-confidence?
35. Do you find it difficult to speak in public?
36. Do you often feel self-conscious because of your personal appearance?
37. If you see an accident, are you quick to take an active part in giving help?
38. Do you feel that you must do a thing over several times before you leave it?
39. Are you troubled with feelings of inferiority?
40. Do you often find that you cannot make up your mind until the time for
action has passed?
41. Do you have ups and downs in mood without apparent cause?
42. Are you in general self-confident about your abilities?
43. (Above the median, ACE Psychological Examination.)

(Mosier, 1937, p. 280. By permission of the Editor, Psychometrika.)

the matrix of 780 correlations, eight independent factors emerged
through the use of Thurstone's centroid method. These were
reduced by graphic rotation to the simple structure shown in Illus. 150.
The highest loadings in each factor are underlined. The tentative 
interpretation of these factors is given on the basis of those items 
which show the highest loadings: 

1. Cycloid: The first trait is closely identified with the tendency to have
wide mood swings.

2. Depression: This is shown by feelings of loneliness, sadness, depreciation,
and anxiety.

3. Hypersensitivity: This trait is psychological rather than physiological.
It is represented by hurt feelings and an inability to stand criticism.

4. Inferiority: This is shown by lack of self-confidence both in social
situations and mechanical and thought problems.

5. Social Introversion: This is typical of persons who are shy and
self-conscious in small, informal groups.

6. Platform Shyness: This is associated with stage-fright and appearance
among strangers.

7. Mental Ability, also called cognitive defect: This is shown by poor
grades in school, and on the American Council Test.

8. Autistic Tendency: This is shown by frequent daydreaming and
wishful thinking of an emotional sort.

Illustration 150 also shows how factorial analyses may be used to 
construct better scales by selecting items which are completely ex- 
plained by only one factor. If an item shows heavy loading in more 
than one factor, its interpretation will be difficult. Thus, item 11 is
good in the sense that it depends upon only one factor, but item 2 
is less valuable for analytical purposes because it has moderate load- 
ings in two variables. 

The per cent of variance due to all the factors is shown in column
h². Item 41 is good because nearly all its variance is accounted for
in this analysis. It has only small random or specific loadings to reduce
its value as an analytical tool. Specific factor loadings can probably be
reduced still more by factor analyses of more elaborate scales. Item 16
is of little value for careful measurement because its h² shows that
only 28 per cent of its variance is accounted for.
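
The relation between an item's loadings and its h² can be verified by simple arithmetic. The sketch below copies the loadings of items 4, 8, and 20 from Illus. 150, restoring decimal points on the assumption that the table prints them without decimals. The "dominant share" index is a hypothetical convenience introduced here, not a statistic used by Mosier; it expresses how much of an item's common variance comes from its single largest loading.

```python
# Loadings on C, D, H, I, S, P, Co, Au, transcribed from Illus. 150
# (decimal points restored; an assumption about the table's convention).
items = {
    4: [0.073, 0.679, 0.006, 0.181, 0.026, -0.032, -0.032, 0.467],
    8: [0.026, 0.200, 0.260, -0.089, 0.155, -0.055, 0.042, 0.467],
    20: [0.002, 0.500, 0.067, 0.299, 0.420, 0.126, -0.361, -0.228],
}


def communality(loadings):
    """h2: the share of an item's variance accounted for by the common factors."""
    return sum(a * a for a in loadings)


def dominant_share(loadings):
    """Hypothetical purity index: part of h2 due to the single largest loading."""
    return max(a * a for a in loadings) / communality(loadings)


for number, loadings in items.items():
    print(
        f"item {number:2d}: h2 = {communality(loadings):.3f}, "
        f"dominant share = {dominant_share(loadings):.2f}"
    )
```

The computed values of h² agree with the printed column to within rounding. Item 20 draws its common variance from several factors, while item 4 draws most of it from one, which is the kind of contrast the text uses in judging the analytical value of an item.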

Mosier interprets his results to mean that each of the factors will 
probably not be found on subsequent research to contain elements 
found in any other factor. 

Each factor may, on subsequent analysis, appear to be a composite 
of several other more basic and elemental factors. The existence of 
these eight independent factors is supported to a considerable extent 
by the reports of Guilford and Lacey (1947), Whisler (1934), Vernon
(1938), Layman (1937), Cattell (1947), and several others.

ILLUS. 150. FACTOR LOADINGS OF PERSONALITY TRAITS

Loadings on Primary Traits (See Items in Illus. 149)

Item No.     C      D      H      I      S      P     Co     Au     h²
    1      048*   026    131    252    234    766    017   -098    738
    2     -079    118    075    048    572    350   -235   -045    532
    3      250    020    380    067    302   -117    064    183    355
    4      073    679    006    181    026   -032   -032    467    719
    5      206    129    452   -156    163    121   -034    199    363
    6      251     ns    518    077    065    032    112    119    378
    7     -057    282   -068    001    [?]    128   -043   -067    56^
    8      026    200    260   -089    155   -055    042    467    363
    9      298    271    236   -011   -010   -076    OSO    481    451
   10      409    151    083   -103    248   -046    113    316    377
   11      808    141   -011    010   -057    013   -012    274    757
   12     -045    009    003    218    708    362   -052    002    687
   13      023    018    106    221   -054    000    075    620    447
   15      202    098    397    498    154    012    254    075    545
   16      289   -125    210    221    000   -106    179    168    279
   18      001   -013    206    003    475    206    274    020    382
   19      010    097    471    307   -057    016    244    046    386
   20      002    500    067    299    420    126   -361   -228    718
   21      142    131    080    170    479    139   -011    461    531
   22      262   -011   -001    349   -053   -031    248    190    296
   23      165    066    008   -090    104    203    549    066    386
   24      158    200    143    142    094    102    346    176    273
   25      290    001    417   -004   -134    174   -006    342    419
   26      387    346    310    187   -026    052   -036    199    445
   27     -026    033   -009    180    278    720    149    037    656
   28      431    560    298   -092    099   -069    158    190    673
   29      252    033    047   -002    063   -057    507    403    494
   30     -140    084    020    096    319    503    328   -018    490
   31      513    591    200    085    116    016    181    142    698
   32      161    714    088    035    161   -045    100    490    795
   33      045   -092    080    293    459    419   -018    223    534
   34      100    236    046    655    499    172   -018   -030    766

35. 

005 

040 

-022 

282 

184 

799 

- 036 

- 036 

760 

36 

-003 

022 

023 

109 

344 

184 

-019 

475 

388 

38 

147 

081 

067 

- 032 

214 

-041 

267 

-038 

153 

39 

145 

042 

041 

42S 

595 

113 

045 

209 

619 

40. 

317 

053 

236 

144 

306 

106 

196 

180 

364 

41 

849 

199 

038 

037 

014 

- 128 

-035 

246 

851 

42. 

-033 

066 

- 134 

583 

516 

049 

004 

-018 

635 

43. 

017 ■ 

-019 

228 

174 

083 

- 101 

-428 

172 

308 


* decimal points properly preceding each entry have been omitted 

(Mosier, 1937, p 283 By permi!»sion of the Editor, P^ychomelrika ) 


Homogeneous Tests 

At the beginning of this chapter the importance of having a single
psychological function measured by a single scale was emphasized.
Such a scale is not easy to prepare logically or statistically, because
differences in mental processes at various levels of difficulty and in
the attitudes and personality traits are hard to observe or control.
Several approaches are used in addition to the logical one. Loevinger
(1948) summarized these and has given an Index of Homogeneity
which has the advantages of being rather simple to apply and
universal in its application. Ferguson (1941) proposed a method of
determining homogeneity of items. Names of people were placed on a
checkerboard, from left to right in order of increasing total scores
at the tops of the columns, and items were placed in order of
increasing difficulty in ascending rows. If the test is perfectly
homogeneous the pluses for correct answers made by each person for
each item will lie above a diagonal broken line on such a
checkerboard. Items which do not measure the same factor will show
different plus and minus patterns. Also, persons who are atypical of the
group will show irregular plus and minus patterns in the columns.
Guttman (1947) has described an ingenious mechanical device, called
a Scalogram Board, for accomplishing the same thing that Ferguson
achieved with his table, but with a smaller number of items. In these
analyses it is assumed that in a perfectly homogeneous test, two
persons who get the same scores will have exactly the same pattern of
correct answers. These techniques are similar to the process of
correlating each item with the total test score. But they go further in
showing clusters of items which are related by virtue of being
answered in similar fashion by certain portions of the subjects. A more
analytical technique is that of correlating each item with all the
rest, as was shown above in Mosier's work. This method provides a
matrix upon which a factorial analysis can be made. By a careful use
of factorial analysis it will be possible to show the factor loading of
each item in terms of the factors that are revealed. From this analysis
items that have a high loading in one factor and a small loading in
other factors can be found.
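
The checkerboard idea can be sketched in a few lines. The example below
is not Ferguson's or Guttman's own procedure; it is a minimal
illustration, with hypothetical function names and made-up data, of the
assumption stated above: in a perfectly homogeneous test a person with
a score of s passes exactly the s easiest items. Each departure from
that ideal pattern is counted as an error, and the proportion of
responses that fit the pattern is reported as a simple reproducibility
figure.

```python
def scalogram_errors(responses):
    """Count departures from the ideal cumulative pattern.

    responses: one list per person, holding 1 (correct) or 0 (wrong)
    for each item, with items in the same order for every person.
    """
    n_items = len(responses[0])
    # Easiness of an item = number of persons who passed it.
    passes = [sum(person[j] for person in responses) for j in range(n_items)]
    # Arrange items from easiest to hardest (ties keep their original order).
    order = sorted(range(n_items), key=lambda j: -passes[j])
    errors = 0
    for person in responses:
        score = sum(person)
        actual = [person[j] for j in order]
        # Ideal pattern: the `score` easiest items right, all others wrong.
        ideal = [1] * score + [0] * (n_items - score)
        errors += sum(a != b for a, b in zip(actual, ideal))
    return errors


def reproducibility(responses):
    """Proportion of all responses that agree with the ideal pattern."""
    total = len(responses) * len(responses[0])
    return 1 - scalogram_errors(responses) / total


# Made-up data: four persons, three items.
homogeneous = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
irregular = [[1, 1, 1], [1, 1, 0], [0, 0, 1], [0, 0, 0]]

print(scalogram_errors(homogeneous), reproducibility(homogeneous))
print(scalogram_errors(irregular), reproducibility(irregular))
```

In the second data set the third person passes only the last item, so
his column departs from the pattern shared by the rest of the group,
which is the kind of irregularity the checkerboard is meant to expose.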

LIMITATIONS AND ADVANTAGES 
OF FACTORIAL ANALYSES 

The procedures outlined above, as well as factorial procedures
generally, place fairly clear limitations upon the usefulness of a
factorial analysis, for example:

1. The method should only be applied when all scores in a
particular test can be shown to represent the same combinations of
factors.

2. The factors which are discovered represent patterns of
behavior which are independent only in the sense that they do not
usually appear in similar or related amounts in the same persons.

3. The factors may be thought of as forces or arrangements which
are found within or without the individuals. They may act against
or in the same direction as other forces, not in purely additive
combinations alone.

4. All factors are, like highest common factors, the most complex
factors that are common to the persons who have been tested. It is
quite probable that most of these factors may consist of a number of
smaller independent factors.

5. The number of common factors found in any battery of tests
will vary with the items which are included and the abilities of the
persons tested. Hence, even though two groups of persons are
appraised by the same tests, analyses of the two groups will often yield
different factor loadings.

Factorial analyses have the following advantages:

1. They allow a mathematical analysis of elements or forces in
tests of individuals according to specific assumptions. The
assumptions can be varied to fit experimental findings.

2. They show which tests involve several factors and which involve
a few factors or only one. The selection of tests or items which are
pure measures of a particular factor is thus advanced. Factorial
analysis is undoubtedly one of the best methods of evaluating an
item, and a great boon in the construction of analytical scales.

3. They show, with statistical precision, the types of patterns
which are not found together in the same individual. From such
analyses individual profiles can be constructed of factors which are
distinctly independent in a particular group of persons.

4. They yield analyses of all kinds of forces, intellectual,
environmental, and emotional, which are operating in the test situation.
Examiners often lose sight of one variable when they are closely
examining another; hence a factorial analysis is a good check on
subjective analyses, and leads to a discovery of new factors.

5. They can also be applied to job analysis, and job ratings, and
thus yield more effective criteria of success.

STUDY GUIDE QUESTIONS

1. What are the two basic assumptions made in applying a factorial
analysis? Which test situations most nearly conform to these assumptions?
Which least?

2. How may common factors be defined?

3. Define correlation matrix, general factor, group factor, specific factor
according to Spearman.

4. What factors did Spearman describe?

5. Define variance, vector, score, orthogonal force, factor loading.

6. How may the total variance of any test be expressed in a formula
which combines common, specific, and chance factors?

7. Give examples of the tests most highly saturated by the primary factors
isolated by Thurstone.

8. Which in Illus. 148 are unique?

9. Which in Illus. 150 are unique?

10. How can factorial analysis be used to develop purer or unique
measures?



PART THREE 


DYNAMIC PATTERNS 





CHAPTER XV 


PERSONALITY: DYNAMIC 
THEORY AND STRUCTURE 




This chapter first gives an exposition of the concepts of personality
common to almost all theories of personality. Then four specialized
theories, which usually supplement each other, are discussed: (a)
those having physical or physiological bases, (b) clinical theories
based on studies of insane or poorly adjusted persons, (c)
psychoanalytical theory which emphasizes the results of studies of sexual
development, and (d) psychological theories emphasizing learning
and social behavior.


INTRODUCTION 

The evaluation of personal dynamics goes far beyond simple
measurement, for it seeks to make a complete picture of various
component parts of an individual and the forces which activate
them.

A human being consists physically of from four to five billion cells,
placed in relation to each other in complex fashion and forming
several hundred somewhat independent reaction systems, which
grow and decline at different rates, and serve different bodily
or social functions. The cells of the body are for the most part
small and easily destroyed and cannot be directly observed while
functioning. In spite of some notable advances in knowledge
concerning the functions of the brain areas the physical mechanics of
perceiving, thinking, or feeling are still to be well demonstrated. It
seems that there is sometimes considerable elasticity and partial
substitution of function and of dominance among the various reaction
systems, which further complicates the study. It is small wonder,
therefore, that for almost any complex physical human behavior,
fairly competent judges often have a difficult time agreeing on just
what has been done physically, and a much more difficult time
explaining why it was done. When competent observers can agree
upon identifying and measuring the complicated patterns of
behavior and the causes of these patterns, there will be a well-developed
science. At present we are still in the process of defining and
identifying patterns.

Among the many theories of personality there are certain recurring
concepts that form common ground. Psychologists Edward L.
Thorndike (1935), Robert S. Woodworth (1938), and Gordon W.
Allport (1937), clinical psychologists Lawrence F. Shaffer (1936) and
Robert W. White (1948), psychiatrists Otto Fenichel (1945) and
Sigmund Freud (1938), and social psychologists Kimball Young (1947)
and Gardner Murphy (1947) agree fairly well in using the following
eleven concepts.

1. The physical continuum. A human being's acts are closely 
related in a continuum of activity, which means that all of his acts 
are parts of a complex stream of action, and are caused by what went 
before, and, in turn, cause further action. 

2. Consciousness and unconscious activity. Behavior cannot always
be observed directly, but is inferred. That which can be observed
by another person is called overt activity; the remainder is
called covert or inner activity. Part of covert activity is known to
oneself; for example, a person is aware of multiplying 48 by 56 mentally
and can report this activity. This type of covert activity is called con- 
scious covert activity. But other activities go on within us, of which 
we may have little or no direct evidence. They therefore cannot be ob- 
served or reported on. An activity of this kind is called unconscious 
activity. It can only be inferred from acts which are hard to explain 
by any conscious activity. For example, a student feels compelled to 
carry her gym shoes to her history class for no conscious reason. Un- 
conscious activity is sometimes given a subdivision, called subcon- 
scious activity, which includes those acts that we observe rarely, or 
can remember only under special conditions, such as hypnotism and 
great fear or excitement, or under a long intense process of recall 
such as occurs during psychoanalysis. It is this inaccessibility of inner 
behavior that has given rise to demonology, astrology, and many 
other schemes which purport to reveal and explain what is beyond
the reach of the conscious mind. 

Most authorities agree that an infant is largely motivated by
unconscious needs, some of which may continue throughout life. Also,
unconscious wishes or fears may be acquired by severely frustrating
or unpleasant experiences.

3. Inherited and environmental factors. An individual starts life
as a single cell which results from the complex union of two other
cells. The substances in the single cell grow or enlarge by assimilating
animal and vegetable substances from outside the cell. Rate and type
of enlargement are determined both by the substances in the cell and
by the substances which it assimilates. Thus even at the start of life
the influences of heredity and environment are combined.

4. Differentiation of cells. The original single cell soon subdivides
rapidly. At first all the cells seem to have the same internal qualities,
but soon it is apparent that their location in the body determines
their specialization of function. Due to the pull of gravity or to other
forces, one portion of the embryo develops more rapidly than the
other and becomes the head. A gradient of activity is established from
the head to the tail. The outer cells develop into skin, sense organs,
brain and the central nervous system. The middle layers of cells
develop into bones and muscles attached to the bones, and the
inner cells, into the viscera. A few cells which remain isolated in the
gonads or reproductive organs are the germ cells. An infant at birth
acts only in complex patterns of surging movements. As nerve and
muscle fibers grow more mature, small groups of muscles are able to
make independent movements, and a large number of new
combinations of movements are possible. By adulthood these become fine
dexterity and agility.

5. Maturation and senescence. At birth growth is very rapid, but
the process gradually slows down until in the adult there is no
further increase in the number of cells. Throughout life there is a
continual replacement of wornout cells until old age when the
replacement process gradually declines in both quality and quantity
of cells. The process of growth is regulated in part by the innate
characteristics of the cells and in part by the amounts of mineral
substances they retain. A number of internal glands also regulate the
maintenance and growth of various parts of the body.

6. Individual differences. Persons differ at birth in the relative
maturation of various parts of the body. The absolute differences
increase until maturity is reached; the relative differences are harder
to determine. At maturity the smallest individuals are from one
half to one third the size of the largest ones. Many psychologists
believe that relative differences in mental abilities are considerably
greater.

7. Learning. Shortly after birth a type of learning called
conditioning is common. This is a process whereby stimuli begin to elicit
reactions which were not called out at first. From time to time these
stimuli were part of reaction patterns which they now recall in part.
Thus the word book comes to represent a type of object book by
their being experienced together under certain conditions.
Numbers, symbols, and feelings of fear or anger are associated in thousands
of combinations.

8. Adjustment patterns. In an infant four adjustment patterns
appear, (a) seeking, when the infant is restless and apparently needs 
something, (b) satisfaction or pleasure, when his wants are met and 
the infant relaxes, (c) dissatisfaction or anger, when his wants are not 
satisfied promptly enough and he becomes violently and destructively 
active, and (d) fear or anxiety, when he is overwhelmed by some real 
injury or imaginary injury and withdraws, if possible, to a safer 
place. These adjustment patterns are sometimes called emotions,
but the word emotion has so many meanings that it should not be 
used without qualification. 

9. Needs. It is from these adjustment patterns that one’s needs, 
conscious or unconscious, are known. A person is moved, or motivated 
to action, by his needs. In an infant needs or drives are nearly all 
related to his physical well being. Soon thereafter he develops great 
pleasure in immediate friendly activities. In adolescence satisfactions 
come to be related to long-term activities, such as a building project. 
In adults, as ideals are developed more clearly and consciously, these 
long-term activities become still more dominant sources of satisfac- 
tion. 

10. Conflicts. Because of all these different needs and drives, con- 
flicts of many sorts appear. One may have a conflict on the physio- 
logical level due to the need for a drink of water and the need for 
additional rest. Or a physiological need may conflict with a mental 
or social need. For instance, one may need exercise but also wish 
to finish listening to a lecture on economics. Sometimes a conscious 
wish may conflict with an unconscious one. Thus I may wish to be
considerate of a person, but at the same time have a subconscious wish 
to get rid of him. Likewise conflicts frequently arise between imagi- 
nary and real satisfactions. For example, a person’s day dreams and 
fantasies may become so pleasant, in contrast to the sordid realities 
of his life, that he will shut himself off from reality and become dis- 
oriented. 

11. Modes of solving conflicts. Authorities differ considerably
over theories concerning the solution of conflicts. At one extreme
are the mechanists who believe that a person's activities are all the
result of natural laws which govern mechanical forces. Thus when
there is a conflict, the action taken is always the resultant of the
various forces present at the moment. At the other extreme are those
who believe that some superhuman being can put aside natural laws
and bring about results which defy mechanical principles.
Somewhere in between those who hold extreme beliefs are those who
believe that they operate according to natural laws but that some of
these principles are not well understood, and that they can influence
or direct their own actions a good deal by a process of will or choice.
Most psychologists explain choice as the result of an inherited and
learned readiness to respond to certain situations in a particular
way. This readiness is sometimes called an ideal or attitude (Chapter
XXI). Thus some persons think of themselves as perfect and
important and choose immediate and selfish actions, blaming others for
all their failures and dissatisfactions. Others think of themselves as
weak and imperfect and blame themselves or feel guilty or anxious
about many situations. Many take a middle position. They admit
some weaknesses but also realize their strengths, and try to analyze
the causes of their behavior and to plan constructive and cooperative
behavior.

From considerations such as these a number of authors have
roughly classified persons according to how well they solve their
conflicts. Four levels of adjustment are described here.

Good adjustment. The well-adjusted adult has average or strong
energy and physiological impulses, balanced by a well-developed
set of ideals for himself and society. His energy is directed to allow
adequate personal satisfactions and to contribute to the reasonable
satisfactions of those about him. He is closely in touch with reality,
but has a good imagination.

Fair adjustment. This person has strong impulses which
occasionally break through into aggressive or foolish action, but
usually his ideals, which are not very well thought through and
organized, channel his behavior into actions which are socially acceptable.
While he is fairly well oriented he is not objective about some
superstitions and fears.

Poor adjustment. Two types appear here, both of which have
many fantasies which seem quite real.

a. One type has weak impulses and strong social ideals. The
combination results in anxiety and compulsive behavior.

b. The second type has such strong impulses that he reacts with
violent aggression or uncontrollable fears. His ideals for himself and
society are variable and poorly defined.

Disorientation. Two types of persons constitute this class.

a. One type has fairly normal impulses and thinking ability but
has largely withdrawn emotionally and lost touch with reality. A
person of this type lives in his own imaginary world. He may imagine 
himself as including a large part of the world, as in some catatonic 
states; or he may think of himself as most normal people do, but 
imagine that other people and things have his thoughts and fears, 
as in paranoia. 

b. The other type has lost touch with reality through a deteriora- 
tion of mental processes due to such causes as great fatigue, disease, 
excitement, drugs, and old age. These people are confused about 
themselves and society, and their impulses are also usually impaired. 

PHYSICAL BASES 

Any appraisal of human behavior must somewhere attempt an
analysis and quantification of bodily form. Sheldon (1940)
summarized the previous work of about thirty investigators and reported
a detailed study of bodily dimensions of four thousand white adult
males. He found that measures from photographs were more reliable
than measures taken by calipers of the soft parts of the body. He
devised a method of photographing persons with a standard 5- by
7-inch portrait camera which was placed at a standard height and
distance from the subject, who was on a pedestal at a standard distance
from a lined neutral-gray background. It was thus possible to take
full-length pictures sufficiently free from photographic distortions,
whose measurements agreed highly with those taken from the living
body. The photographic measures had the advantage of yielding
indicators of typical posture and curvature. However, actual
measures of height and weight were always taken.

ILLUS. 151. SOMATOTYPES FROM SHELDON

A. Extreme endomorphy. C. Extreme ectomorphy.

(From Sheldon, 1940. By permission of W. H. Sheldon and Harper & Bros.)

Sheldon stated that from inspection of the four thousand subjects,
three, and only three pronounced extremes stood out, which
corresponded roughly to Kretschmer's three types of pyknic, athletic,
and asthenic persons. Sheldon, however, rejected Kretschmer's
terminology because he wished to make certain important changes in
definition, and to show a relation between body structure and
relative dominance of three layers of tissue. His three body types, called
somatotypes, are described in Illus. 151 and Illus. 152. The
endomorphic type is related to dominance of the inner layers of cells in the
embryo which by adulthood have developed into the viscera (Illus.
151A). The mesomorphic type shows relatively greater development
of the mesial or middle layer of embryonic tissue which grows into
bone and voluntary muscles (Illus. 151B), and the ectomorphic to the
outer layer which is the basis for brain, nerves, skin, and most of the
sense organs (Illus. 151C).

The main criticisms of Sheldon's work are that the three types were
established by inspection and logic, and that his later work may have
been too much influenced by selection of evidence to support a
theory. Sheldon's basic measurements, however, are sound, and future
research can modify his hypothesis if modification is needed. One of
the difficulties of devising morphologic scales is caused by the fact
that even in a large group few persons are found in the extreme
categories, and the mixtures present complex combinations.

After use of elaborate rank-order procedures based on specific
measurements, Sheldon devised a tripolar scale in which each of
the three bodily types is represented on a 7-point scale, using 7 for
the most extreme form, and 1 for its almost complete absence. The
steps in each scale are identified by morphological indices and
photographs. In this scheme a 711 type would have 7 points on the
endomorphic scale, and one each on the other scales. A 444 type would
represent an equal blending of all three types (Illus. 155). Of the 343
mathematically possible combinations of types, 76 have been found
and described.
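
The figure of 343 follows from three components each rated on seven
steps, 7 x 7 x 7. A brief sketch, using only the rating scheme described
above (the variable names are illustrative), enumerates the possible
ratings:

```python
from itertools import product

# Each rating is a triple (endomorphy, mesomorphy, ectomorphy),
# with every component scored from 1 to 7.
somatotypes = ["".join(map(str, t)) for t in product(range(1, 8), repeat=3)]

print(len(somatotypes))       # number of mathematically possible ratings
print("711" in somatotypes)   # the extreme endomorphic rating is among them
```

Only a fraction of these ratings were reported as actually observed,
which is consistent with the remark that few persons fall in the
extreme categories.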

To aid in classification, the body was divided into five regions for
which seventeen separate measures were taken (Illus. 153 and Illus.
154). Each region yielded its own morphological measures. These
measures combined gave the general morphological type and also
yielded evidence of the amount of disagreement, called dysplasia,
among the five regions. The amount of dysplasia was calculated by
summing the differences between the five regions. The total
difference in the sample (Illus. 154) was 20, which Sheldon found to
be about the 96th centile in the adult male sample, the mean of
which was 10.22. The dysplasias are reported to be more important in
several mental diseases than the total body type.


ILLUS. 152. DESCRIPTION OF SHELDON'S SOMATOTYPES

Body
  Endomorphic: Round, soft, no muscle due to subcutaneous fat;
    front-to-back and side-to-side diameters about equal in head,
    neck, trunk and limbs. Long trunk and large volume. Pseudo-breast
    in male, but abdomen larger than thorax.
  Mesomorphic: Square, hard, large muscles. Some fat. Lateral
    diameters much greater than anteroposterior. Thorax larger than
    abdomen, and wider at shoulders than waist; broad hips. Trunk may
    be long or short.
  Ectomorphic: Thin, shoulders droop and are narrow; trunk, abdomen
    short and shallow. Thorax larger but shallow.

Neck
  Endomorphic: Short, obtuse angle with chin.
  Mesomorphic: Long, large muscles on sides.
  Ectomorphic: Slender, bends forward.

Face
  Endomorphic: Wide, both lower and upper; ears and nose flat.
  Mesomorphic: Large eyebrows, cheekbones and jaws. Bones and muscles
    prominent.
  Ectomorphic: Small, lean features. Chin recedes. Upper ears project.
    Length more than width.

Head
  Endomorphic: Large, spherical.
  Mesomorphic: Cubical.
  Ectomorphic: Large cranium.

Limbs
  Endomorphic: Short, tapering, weak, small hands and feet.
  Mesomorphic: Massive and strong, large hands, large joints.
  Ectomorphic: Long, especially in distal segments; long fingers and
    toes.

Vertebrae
  Endomorphic: Nearly straight.
  Mesomorphic: Slight S, bowing in at lumbar region.
  Ectomorphic: Marked S.

Genitalia
  Endomorphic: Small.
  Mesomorphic: Well-developed, compact.
  Ectomorphic: Elongated.

Skin
  Endomorphic: Soft, velvety, smooth.
  Mesomorphic: Thick, coarse, easily tanned, wrinkled, large pores.
  Ectomorphic: Thin, dry, fine wrinkles, does not tan.

Bones
  Endomorphic: Small.
  Mesomorphic: Large, thick, heavy.
  Ectomorphic: Small, delicate, prominent.

Hair
  Endomorphic: Medium amount, fine texture, prematurely bald on back
    of head.
  Mesomorphic: Variable in amount. Coarse, bald in front.
  Ectomorphic: Fine or very fine, baldness rare, hard to comb.

(Adapted from Sheldon, 1940, Chap. III. By permission of W. H. Sheldon
and Harper & Bros.)

ILLUS. 153. LOCATIONS FOR MEASURING SEVENTEEN DIAMETERS

(From Sheldon, 1940. By permission of W. H. Sheldon and Harper & Bros.)

Two reports of factorial analyses of bodily measures should be
considered here, one by Thurstone (1946) and the other by Burt
(1947). Thurstone used on one hundred male adults, ten
measurements which had been made by Hammond. The measurements
included 2 of height, standing and sitting; 3 of the hand, span, length
and breadth; 4 of the trunk, shoulder, chest, and hip breadth, and
chest depth; and 3 of the head, breadth, length, and height. He found
four primary factors which were concerned with bone length, head
size, girth, and hand size.

ILLUS. 154. SHELDON'S REGIONS FOR MORPHOLOGICAL
MEASUREMENT

Region                               Sample
I    Head, Face, and Neck             ?61
II   Thoracic Trunk                   252
III  Arms, Shoulders, and Hands       451
IV   Abdominal Trunk                  ???
V    Legs and Feet                    352
     Medians by Column                352

(By permission of W. H. Sheldon and Harper & Bros.)

Burt, who began studies of this kind 40 years ago, summarized a
number of studies in 1947, all of which yielded, according to Burt's
"simple summation" method, a general factor of size, which can be
expressed as a size quotient analogous to an IQ. This size quotient
has high predictive value, and correlations between siblings of about
.50 were reported among London school children. Among English,
Irish, Welsh, American, and Jewish adult males, factor patterns were
found which were similar to those in a sample of 528 Royal Air
Force men. Using seventeen physical measurements, a general size
factor appears which is about as large as all the other factors
together. When this is removed two broad group factors emerge, one
associated with weight, and with neck, waist, and thigh girth, the
other with height and leg length. Then two pairs of narrower group
factors emerge as subdivisions of each broad group factor. These are
most closely related to trunk girth, limb girth, sitting height, and
thigh length.

These factorial studies support Sheldon's analysis of types as far
as they go, but they do not include many of the variables that
Sheldon used, and the statistical treatment does not yield composite
or pattern scores of any kind. It seems probable that more research
in this field will show fairly stable physiological patterns related to
various temperament patterns (Illus. 215).

CLINICAL SYNDROMES 

Another approach to the study of behavior is that of psvehiatrists 
who follow Kraepclin (1895) in his description of clinical s>ndromes. 
SyndTome means literally “a lurining together,” and in this instance 
it refers to a combination oL symptoms or patterns of behavior. 
Syndromes are not easy to describe or to agree upon Since the same 
patterns ol behavior may appear in several syndromes, they are not 
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ILLUS. 155. SHELDON’S GYNANDROMORPHY SAMPLES 



(From Sheldon. 1940. By permission of W. H. Sheldon and Harper & Bros.) 

Two individiials of different somatotype showing different degrees of gynandro- 
morphy. The individual on the left is of a somatotype, 523-524, normally high in 
gynandromorphy- The individual on the right is of a somatotype, 262-172, nor* 
mally low in gynandromorphy. 
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ILLUS. 15i>, SHELDON’S- GYNANDROMORPHY SAMPLES {Confd) 



(From Sheldon, 1940. By permission of W. H. Sheldon and Harper & Bros.) 

Two individuals of the same somatotype, 442, showing different degrees of gynan- 
dromorphy. These physiques are of the same somatotype in all regions of the 
body except the second (thoracic trunk). The individual on the left is higher 
and the one on the right is lower in gynandromorphy, than is the average 442. 
clearly independent of one another. Syndromes can be identified only
by studying a person’s behavior over a period of time. Clinical
syndromes may be roughly divided into two groups:

1. Syndromes which involve serious disorientation over long periods
and usually lead to mental deterioration are called psychoses.
They are sometimes caused by alcohol, drugs, and disease or injury,
but usually physical causes are thought to hasten the psychosis,
rather than to be the basic cause. Nearly all of these syndromes lead
to hospitalization. Their beginnings are seen in failure to adjust
to persons, which continues and becomes serious. Basic needs for
security are met by rejection, and drives for power or sexual
satisfaction conflict with desires for friends and with socially
accepted behavior. In one type of psychosis, called dementia praecox
or schizophrenia, people with fairly active minds withdraw from actual
situations into a world of fantasy, often tinged with thoughts of
persecution and exaggerated self-importance, called paranoia. The more
these persons become engrossed in their own thoughts and wishes,
the more they become unable to observe clearly what is going on
around them. They become unable to give an accurate statement
concerning who or where they are (disoriented), and they accept as
true their beliefs which are patently false (delusions). Their
perception is also affected so that they make false recognitions of
common objects (illusions) or imagine objects to be present and accept
these visions as facts (hallucinations).

Another common type of psychosis, called manic-depressive insanity,
is a gross exaggeration of normal mood swings. A person of
the manic-depressive type reacts to stimulation or frustration by great
excitement, elation, undertaking too many things, and rapid activity.
Extreme cases in which there is great impulsiveness and, later, lack
of memory are called manias. These states may last months or years,
but usually the person is worn out after a few weeks and goes into a
depressed phase when he is gloomy and despondent, and sometimes
in a coma when he cannot be aroused by the usual means.

2. Syndromes which involve partial disfunction are called
psychoneuroses. These may be described as exaggerated bad habits. They
are thought of as poor adjustments, which, however, often prevent
further breakdown by giving sufficient protection to the person in
ordinary situations. Thus, some persons worry excessively about
bodily functions (hypochondriasis), but the worry in itself gives some
satisfaction and may prevent more serious maladjustment. In others,
feelings of guilt and fear of the future are converted into involuntary
physical symptoms: paralysis, anesthesia, or cardiac or digestive
difficulties (conversion hysterias). Hysterical symptoms often last for
years, since they usually gain sympathy and protection for the
individual. They may be rapidly cured by strong emotional convictions,
such as faith healing, but the cures are not likely to be permanent
unless the sick person achieves sufficient security and insight.

Other persons develop unreasonable fears (phobias) about particular
situations or objects as a result of shocking and embarrassing
situations. While a phobia may be burdensome, it may have protective
value in preventing the person from getting into the same or a
similar situation. Other persons perform acts without knowing why
or having good reason, called obsessive or compulsive acts
(psychasthenias). In some persons discouragement and lack of
self-confidence reach a stage of severe depression and suicidal
tendencies, or their conflicts may lead to a conscious disregard of
social responsibilities, and to lying, stealing, alcoholism, and
immorality. These actions characterize the psychopathic personality.

Each of the above types may be independent of the others, but
sometimes they are combined, probably reflecting social pressures
and weaknesses in several areas. All are basically types of
disorganization resulting from conflicts and feelings of frustration.

Many competent psychiatrists have pointed out that while these
rough classifications of syndromes are convenient, they are not easy
to use accurately. They are either too vague or too elaborate to be
used in understanding basic causes.

PSYCHOANALYTIC THEORY 

The psychoanalytic theory of personality structure as described
by Freud (1938), Healy, Bronner, and Bowers (1930), and Fenichel
(1945), postulates three structural parts which are not clearly
separated, but nevertheless have distinct functions (Illus. 3). None of
these parts has a specific location in the body, and all depend upon
brain processes which are, of course, related to sensory and motor
processes. The largest part, called the id, is the reservoir of instinctive
impulses, some of which appear at birth and some later. The id also
contains memories of experiences or wishes which have been forced
out of consciousness (repressed). The structure or content of the id
can never be known directly, but is inferred from various irrational
acts and from dreams. The id continually drives a person by impulses
for personal pleasure called libido, which include sex and protection,
and by impulses for death called mortido.

In a newborn infant the id is the only coiiiponent of personality. 
It consists of inherited needs or drives and of instincts which unfold 
at later periods. The id is the source of nearly all the internal drives
a person has throughout life. As perception, memory, and reasoning
develop, a second component, the ego, or the conscious self, takes
shape. The ego includes ideas and ideals of one’s own physical and
mental abilities, as well as attitudes and knowledge of the
environment. The ego receives impulses from the attitudes of the id,
and at the same time is aware of pressure from the environment. The
ego is a coordinator. In maturity the various id drives are usually
merged and channeled by the ego into socially acceptable activities,
but in early life and also in various stress situations the ego may be
ineffective with disastrous results.

At an early age a third component of personality gradually
develops called the superego. It is defined as a strong unreasonable set
of prohibitions learned in childhood, the prohibitions of parents and
society. Some of these prohibitions are later seen as reasonable and
become part of the ego along with the constructive ideals for society
which one accepts. This part of the ego corresponds roughly to
conscience.

Serious personality deviations occur when any one of the three 
components — the id, ego, and superego — becomes too dominant. 
Thus, when the superego, representing oppressive restrictions of 
society, real or imagined, is too strong, the person feels anxious, guilty, 
and fearful. When the ego becomes too dominant, the person may 
withdraw from society and also retreat from his own id impulses, or 
he may try to be a dictator. When the id is too dominating, the
person is too impulsively aggressive, lustful, gluttonous, and
uninhibited.

These definitions of id, ego, and superego give meaning to what 
are called psychoanalytic mechanisms. These are unconscious
patterns of behavior which resolve conflicts between the id, ego, and
superego. Thus the repression mechanism is the unconscious process
of resolving a conflict between the superego and the ego by driving
the conscious wish or need into the id. Repression serves to protect
the conscious ego from unbearable pain, but it also leads to all
kinds of serious maladjustments, for the repressed wishes, like
prisoners, try to escape from the id.

Projection is the process whereby a person attributes to other
people or to objects his own unconscious (id) drives. These drives
which have been repressed are projected into the environment
and are regarded consciously as belonging to the external world.
(This concept of projection is similar to but not the same as the more
general word “projective” as used to describe tests in Chapters XVII,
XVIII, XIX, and XXIII, where projective is used to describe any
behavior in which a person reveals that he believes that others have
the attributes, skills, and motives which really belong to himself.)
The list of psychoanalytical mechanisms is long and complex. One
should consult a special text (Freud, 1938), if interested in further
definitions.

Among psychiatrists and clinical psychologists, one of the most
widely accepted theories of personality development is Freud’s
psychosexual theory. Freud uses the words love and sex to cover a wide
variety of activities of various parts of the body. Almost all types of
pleasure or satisfaction are considered to give sexual or erotic
satisfaction, because they seem to be closely associated, at least in
early childhood. A brief outline of the usual stages or levels of
development follows:

a. At birth and for a few months thereafter the infant is in the
oral erotic stage when its chief satisfactions come from the mouth
region. The mother’s body is the first object to be recognized and a
strong attachment for her develops. Fixation at this stage or later
regression is indicated by intense satisfaction from oral stimulation.

b. The first stage normally develops into a second, called oral
sadism. Here the infant gradually recognizes various objects and
seeks to incorporate them into his own body or to destroy them.
This stage is usually related to weaning or loss of the breast, and the
activity is at first directed toward the mother. Fixation at this level is
thought to be related to manic-depressive disorders in later life.

c. During the second and third years of life the pleasure and
satisfactions involved in defecation become more prominent and
normally supplant oral satisfactions to a large extent. This stage is
called anal eroticism. It is usually accompanied by or followed by
stages in which a child may seek to control his parents or environment,
called anal sadism or anal expulsiveness. The child may smear himself
and others or their property with feces. Or, he may develop anal
masochism where he punishes himself by retaining feces, the expulsion
of which would normally be a pleasure. Fixation at the anal masochism
stage is probably related to miserliness, constipation, and obsessive
compulsions.

d. In the third to seventh years the early genital or phallic phase
becomes prominent. Manipulation of the external genitalia becomes
a marked source of pleasure. Boys and girls realize the differences
in the structure of their external organs and react by an inherited
instinct, called the Oedipus complex. Boys retain the attachment to the
mother while developing feelings of rivalry toward the father and
consequently fear of castration by the father. Girls develop a strong
attachment to the father, presumably from the disappointment of
a lack of penis for which the mother is blamed. Girls have therefore
a more difficult transition than boys, in that they must turn away
from their mothers; on the other hand they do not suffer fears of
castration.

e. From the seventh to the twelfth years there is usually a period
in psychosexual development when one’s chief love object is oneself.
This love of self is called narcissism. A strong fixation at this
stage is related to masturbation, wishful dreaming, withdrawal from
reality, and schizophrenia. The narcissistic stage is often followed
by homosexuality, which means that the love object is another person
of the same sex. This is particularly common in groups made up
of the same sex. During this period social and ethical attitudes
toward sex are generally learned, as well as the nature of birth and
something of the sexual act. Also, there is a rapid development of
prohibitions, or taboos, as well as of constructive ideals for self and
society.

f. In puberty, the Oedipus complex may have a resurgence, but
normally social taboos of incest and a variety of social contacts lead
to sexual adjustments with a partner of the opposite sex, called
heterosexuality. Failure in this adjustment at this stage results in
regression toward a narcissistic love-object.

The adolescent usually identifies self with the more successful
rival. In this identification the frustrated individual introjects, or
sets up within himself or herself the prohibitions of the parents,
usually a mixture of both, but gives more heed to the rival parent.
In addition to these negative or punishing prohibitions there is
normally developed a positive set of ideals made up of the parents’
good wishes and socially acceptable aspirations.

The normal sequence of psychosexual development may be
disturbed at any level, and the person may regress to an earlier stage or
fixate, that is, remain where he is by failing to progress to the next
stage. The most frequent causes of such disturbances are thought to
be deprivations or overindulgences by the parents, who thus tend
to induce fears, anxieties, and repressions in their children. Another
serious source of maladjustment may be sibling rivalry, which is due
to feelings of rejection. The child may show hostility toward parents
or the sibling, or he may react with feelings of anxiety and
withdrawal. A third source of maladjustment may be physical illness or
glandular imbalance, which make one mentally dependent and
physically unable to cope with the demands of normal situations.

One of the recurrent criticisms of Freud’s theory of personality
takes exception to his insistence upon the influences of specific
instinctive patterns and of very early experiences. Harold Orlansky
(1018) reviewed 149 articles in an attempt to discover relationships
between adult character and very early experiences. He discusses at
some length the effects of bottle feeding versus breast feeding, the
length of breast feeding, feeding on self-demand versus scheduled
feeding, the duration of the weaning process, and thumb sucking. He
concludes that there is no single factor or group of factors which are
closely related to various sorts of personality factors in later life.
More specifically, he finds that oral deprivation may or may not be
related to other factors. Thus many reports are found of thumb
sucking among children who had unlimited nursing on self-demand.
Mothering, cuddling for warm protection, and sphincter training
are likewise found to have little established relationship to later
personality structure. The theory that too early bowel training leads
in extreme cases to limiting all pleasant acts, becoming parsimonious,
stingy, meticulous, punctual, self-restraining, and sadistic finds little
evidence in many of the careful studies. Evidence is found that the
attitude of parents and the cultural pressures applied in early
childhood, pre-childhood, and adolescence, or even in adulthood are more
important in character formation than very early experiences.

On the question of frustration at an early age, Dollard’s (1939)
definition of frustration is put forward, namely, that frustration
occurs when a goal response suffers interruption or an insightful goal
response is interfered with. According to this theory frustration
does not follow from restraint but occurs only when some fairly
well-established goal activity is interrupted. Furthermore, frustration is
often tolerated by persons for a long period of time and may not
lead to aggression or to serious dissatisfaction if other forms of
satisfaction are available. Without adequate controls conflicting theories
blossom. For instance, the cradle-board restrictions of Southwest
Indians are said to have induced passivity, aggressiveness, sadism and
cruelty, and stoicism and toleration.

Orlansky believes that rigid character structuring is impossible
during the first few years of life because of the nature of an infant’s
organization and the rapid forgetting of experiences. He points out
that characteristics are radically changed at various ages and that the
continuity of personal behavior may be due largely to the continuity
of an environment rather than to early conditioning. Specific
patterns of restraint do not have specific psychological impact on the
child. Any discipline’s effect is related to many other forces: parental
attitude, organic constitution, later social conditions, etc. He thinks
that adult personality is not the result of an instinctive complex of
drives mechanically channeled by early discipline, but that it is the
continually changing product of many experiences. Orlansky,
therefore, believes that the Freudians have greatly exaggerated the early
effects of childhood experiences on the development of personality.
He points out that one of the principal difficulties in understanding 
reports is due to the failure of most workers to define personality in 
terms which can be clearly understood. 

OTHER PSYCHOLOGICAL THEORIES 

There seem to be three other important psychological approaches
which contribute to personality theory: the factorial approach is a
logical and mathematical attempt to describe the parts or factors
of a person, and the way they work together; the configuration or
topology gestalt approach contends that a person is a complex
configuration of forces operating with reference to an environment
which is also a complex configuration; a third approach, called
the personalistic approach, emphasizes the fact that gestalts,
perceptions, factors, and skills are not things or entities, but only the
ways a person acts. A complete description of these theories lies
beyond the limits of this book. The factorial approach has given rise
to a great deal of measurement activity which is described in
considerable detail in Chapters XIV and XXII. The other two
approaches are illustrated in the observational and projective techniques
described in Chapters XVII, XVIII, XIX, XXIII, and XXIV.

USES OF MEASURES OF DYNAMIC PATTERNS

Measures of dynamic patterns may be used with normal groups, 
with clinical groups, or to test hypotheses with regard to personality 
growth and structure. 

Normal Groups 

The large majority of studies of dynamic patterns are probably 
made within what may be called normal groups. These studies are
carried out by educators who wish to know the extent to which goals 
of personality development have been achieved — goals such as in- 
terest in social improvements, reasonable vocational choice, atti- 
tude toward various kinds of community institutions and minority 
groups, and the like. Emphasis upon the development of good
personality and judgment through school programs has enormously
increased, due in part to the availability of these measures in
elementary school, high school, and college.

A similar development has taken place in industry where human 
motives are recognized more and more as vital to production. A high 
degree of willing and intelligent cooperation has been demonstrated 
to be one of the most effective ways of solving production problems
and problems involving social attitudes. The measurement of
dynamic traits is used both for the evaluation of employees of
management and for its policies. Important studies are now being carried
out, in which the success of authoritarian versus democratic procedures
in industry is being compared. Military establishments in this
country and abroad have made millions of ratings of attitudes and
personality traits of soldiers and officers, and are now in the process of
evaluating the reliability and usefulness of these ratings.

Another highly important use of measures of dynamic forces is in
sociological studies of attitudes toward minority groups, toward
community and religious groups and politics, and toward policies
of various governing groups.

Lastly, there is a beginning of many important anthropological
studies among normal groups. Racial differences have been of
considerable concern to those who look forward to world peace. The
temperament of the Japanese and of some Southern Pacific groups
have been studied as well as the temperaments of many groups in the
United States and Europe. Anthropological studies may prove to
be an extremely important factor in successfully planning
cooperative self-government on an international basis.

Clinical Groups 

The use in clinical practice of standardized appraisals of ability or
adjustment is still questioned. One group, which includes many
psychiatrists and those who use nondirective therapy, believes that an
early and elaborate testing procedure may hinder rather than help,
because it may make the patient feel that the tests and the counselor
will solve his problems or bring about a favorable change in his
situation, whereas in reality the most important changes will only
come from developing his own insight and motivation. Rogers (1946,
p. 144) writes:

In conclusion it may be said that the counselor who has come to use the
client’s motivation for growth as the mainspring of the counseling process
is not opposed to tests, but has found them unsatisfactory for promoting
client growth. For one thing counselor-administered tests interfere with
the process of catharsis, insight, and positive choice which has been shown
to be characteristic of growth as it takes place in therapy. It also seems to
the client-centered counselor that the measurement of abilities and
personality traits as though they were static loses much of its significance in
the light of counseling experience. The changing and dynamic use the
individual makes of his abilities, the self-initiated changes in personality
characteristics which occur as a result of counseling, seem much more important
than the measurement of these fluid entities in terms which give them a
spurious permanence. Only when (1) the need to take tests is a significant
aspect of the client’s symptomatic behavior or (2) it is impossible for the
client to be responsible for a choice, or (3) research purposes require a
measurement of an admittedly changing characteristic, do psychometric tests
seem to have a purpose with which the nondirective counselor can agree.

A large majority of clinical workers, while subscribing to Rogers’
ideals, feel that a battery of standard tests of ability, interest, and
mode of adjustment is useful at an early stage in therapy, both to the
client and to the therapist.

STUDY GUIDE QUESTIONS

1. What aspects of a person complicate the study of personality?

2. What is meant by continuum of activity?

3. How can patterns of behavior be accurately identified?

4. What are the distinguishing characteristics of Sheldon’s extreme body
types?

5. What is meant by dysplasia?

6. What was Spearman’s analogy to a person?

7. What are the main Kraepelinian syndromes?

8. Describe the id, ego, and superego.

9. Describe the usual stages of sexual development according to
psychoanalytic theory.

10. What aspects of personality does Orlansky stress?

11. What temperamental patterns does Sheldon outline?

12. What principal factors in personality were found by Cattell’s
factorial studies?

13. Of what importance are measures of dynamic patterns in education
and industry today?
TYPES OF ESTIMATES




This chapter describes certain ways to make estimates of oneself or
others. It also gives rules for the preparation of rating scales and for
the raters to follow. Lastly, methods for determining the validity of
ratings are discussed.

CLASSIFICATION OF ESTIMATES 

Because they do not lend themselves to direct measurement, such
intangible traits as artistic and vocational preferences, attitudes
toward war, racial groups, or institutions, and the relative effectiveness
of workers in various situations are estimated. Four types of recorded
estimates are common:

a. Inventories or questionnaires usually contain multiple-choice
items which are scored and tabulated in much the same way that
objective-type tests are.

b. In paired-comparisons each person is compared with each of
the others for some attribute.

c. In rank-order estimates each person is placed in the order of the
amount of an attribute he possesses.

d. Rating scales employ adjectives, letters, or numbers to indicate
on a scale the degrees or amounts of an attribute possessed.

These types of estimates are sometimes combined in various ways.

Inventories 

When an inventory contains a series of items in the form of
questions in answer to which one is asked to express a preference, attitude,
or judgment of how accurately the items fit particular persons or
situations, it is called a questionnaire. Since a large number of items can
be used, many inventories show reliability correlations of from .80
to .90. Because inventories can be easily applied to groups, they are
widely used, and more careful research has been done on them than
on any other type of rating. Their one serious disadvantage is that
they can be, and probably often are, answered untruthfully,
sometimes unintentionally so. A great deal of work is now being done to
set up means of detecting falsification on an inventory.

ILLUS. 156. SAMPLES FROM THE BELL ADJUSTMENT INVENTORIES

THE ADJUSTMENT INVENTORY

STUDENT FORM

(For students of high school and college age)

By HUGH M. BELL

Yes No ? Are you subject to hay fever or asthma?

Yes No ? Do you often have much difficulty in thinking of an appropriate remark to make in group conversation?

Yes No ? Have you been embarrassed because of the type of work your father does in order to support the family?

Yes No ? Have you ever had scarlet fever or diphtheria?

Yes No ? Did you ever take the lead to enliven a dull party?

Yes No ? Does your mother tend to dominate your home?

Yes No ? Have you ever felt that someone was hypnotizing you and making you act against your will?

Yes No ? Has either of your parents frequently criticized you unjustly?

Yes No ? Do you feel embarrassed when you have to enter a public assembly after everyone else has been seated?

Yes No ? Do you often feel lonesome, even when you are with people?

(Reprinted from Adjustment Inventory by Hugh M. Bell with the permission of
the author and of the publishers, Stanford University Press.)

Self-rating inventories, which are to be marked or checked in some
simple fashion, are more common than other forms. For example, the
Bell Adjustment Inventory (1938) asks that yes, no, or ? be indicated
on one hundred sixty items (Illus. 156), and the Mooney Check List
(1942) simply asks that the items that apply to oneself be underlined.
Another inventory used in industry is Hoppock’s Job Satisfaction
Blank (1935) (Illus. 157).

Most inventories are scored by simply adding the answers that are 
thought to be or have been shown to be related to some trait or 
criterion.
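
The additive scoring just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item numbers, the scoring key, and the function name `score_inventory` are hypothetical and are not taken from the Bell inventory or any other published blank.

```python
def score_inventory(responses, key):
    """Return the number of answers that agree with the scoring key.

    responses: dict mapping item number -> "yes", "no", or "?"
    key:       dict mapping item number -> the answer counted toward the trait
    Items that do not appear in the key add nothing to the score.
    """
    return sum(1 for item, keyed in key.items() if responses.get(item) == keyed)


# Hypothetical key and answer sheet, for illustration only.
key = {1: "yes", 2: "yes", 3: "no"}
responses = {1: "yes", 2: "no", 3: "no", 4: "?"}
score = score_inventory(responses, key)
print(score)
```

A separate key of this kind would be needed for each trait or criterion that the inventory is meant to appraise.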

The recent growth in the number and the size of inventories has
been enormous. Those intended to appraise vocational interests are 
discussed in Chapter XX. Questionnaires on attitudes and opinions 
are illustrated in Chapter XXI, on emotional and social adjustment 
in Chapter XXII. 

Paired-Comparisons Method 

A scale which has equal-appearing intervals may be constructed by 
comparing pairs of items directly, and then arranging them in such 
a way that the proportion of judges placing one above the next will 
ILLUS. 157. JOB SATISFACTION BLANK

You are asked to help in a scientific study by answering the questions in this blank. Neither your
employer nor any of your associates will be allowed to see your answers. Your replies will be added
to those of many other people, and only the group totals will be published. Do not put your name
on the paper. Your answers will be worthless unless they are perfectly frank and truthful. If for
any reason you prefer not to tell exactly how you feel about your job, please return the blank unmarked.

Choose the ONE of the following statements which best tells how well you like your job.
Place a check mark (√) in front of that statement.

  I hate it.
  I dislike it.
  I don’t like it.
  I am indifferent to it.
  I like it.
  I am enthusiastic about it.
  I love it.

Check one of the following to show HOW MUCH OF THE TIME you feel satisfied
with your job.

  All of the time.
  Most of the time.
  A good deal of the time.
  About half of the time.
  Occasionally.
  Seldom.
  Never.

Check the ONE of the following which best tells how you feel about changing your job.

  I would quit this job at once if I could get anything else to do.
  I would take almost any other job in which I could earn as much as I am earning now.
  I would like to change both my job and my occupation.
  I would like to exchange my present job for another job in the same line of work.
  I am not eager to change my job, but I would do so if I could get a better job.
  I cannot think of any jobs for which I would exchange mine.
  I would not exchange my job for any other.

If you could have your choice of all the other jobs in the world, which would you choose?
(Check one)

  Your present job.
  Another job in the same occupation.
  A job in another occupation.

Check one of the following to show how you think you compare with other people.

  No one likes his job better than I like mine.
  I like my job much better than most people like theirs.
  I like my job better than most people like theirs.
  I like my job about as well as most people like theirs.
  I dislike my job more than most people dislike theirs.
  I dislike my job much more than most people dislike theirs.
  No one dislikes his job more than I dislike mine.

Which gives you more satisfaction? (Check one)

  Your job.
  The things you do in your spare time.

Have you ever thought seriously of changing your present job?

Have you ever declined an opportunity to change your present job?

Are your feelings today a true sample of the way you usually feel about your job?

The following questions need not be answered if they would enable anyone to know
that this paper is yours.

  What is your job? (For example: Carpenter)
  Age at last birthday
  Sex
  Date

On the line below, place five check marks to show how well satisfied you were
with your last five jobs. Use a separate check mark for each job. You may place
each mark anywhere on the line either above one of the statements or between two of
them. If you have had less than five jobs, use only as many check marks as you have
had jobs. Draw a circle around the check mark which indicates your present job.

  Completely dissatisfied | More dissatisfied than satisfied | About half and half | More satisfied than dissatisfied | Completely satisfied

Report Blank Used in Survey of Job Satisfaction, New Hope, Pa., 1933

(Hoppock, 1935, p. 243. By permission of Harper and Bros.)
be the same. By this method each item is judged to be better or
worse than each of the others, by some order of comparison which will 
not suggest the answer. If ten items are to be compared, one must
make $n(n-1)/2 = (10 \times 9)/2 = 45$ judgments. These forty-five
comparisons must be made by a number of persons, or by the same person
a number of times. The more judges, the better will be the results, 
provided all the judges are equally competent, because chance errors 
tend to become less important as the number of judgments increases. 
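
The count of judgments can be checked with a short sketch. It is only
an illustration; the function name paired_judgments is hypothetical
and does not come from the text.

```python
def paired_judgments(n_items: int) -> int:
    """Number of distinct pairs among n_items, i.e. n(n - 1) / 2."""
    if n_items < 2:
        return 0
    return n_items * (n_items - 1) // 2


# Ten items, as in the example in the text.
print(paired_judgments(10))
# Six items, the number of photographs compared in Illus. 158.
print(paired_judgments(6))
```

Each judge must make every one of these comparisons, so the total
labor is the number of pairs multiplied by the number of judges.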

Fechner (1871) suggested a method for changing percentages to
standard deviations of a normal curve. Witmer (1894) and Cohn
(1894) used this same method to investigate aesthetic responses to
form and to color. Titchener (1902) used it in studies of pitch and
rhythm and discrimination. A thorough mathematical treatment for
the scaling of paired comparisons was given by Thurstone (1927),
who took into consideration the various assumptions which must be
made in calculating scale values. These assumptions and calculations
cannot be described in this book, but the simplest case assumes that
(a) the distribution of responses by a large number of judges to any
one stimulus will be in the form of a normal curve, (b) the judgments
of differences between two stimuli will also fall into a normal
distribution, and (c) errors made in responding to one stimulus are
not correlated with errors made in judging the second stimulus.

When these assumptions can be made, and when judgments have
been tabulated, the data can be arranged as shown in Illus. 158. Here
the results of comparing six photographs for excellence in
composition are shown by the percentages of judges who thought that
one photograph was better than each of the others. It is shown that
70 per cent of the judges believed that A was superior to B, etc. The
scale values for each item can be found by changing the per cents
into standard scores, from Illus. 129, and then combining them. One
should consult Thurstone (1928) or Guilford (1936) for examples of
this technique.

ILLUS. 158. THE PAIRED-COMPARISONS METHOD

            Per cents preferring each item
Item      A      B      C      D      E      F
A         -     .30    .24    .15    .10    .01
B        .70     -     .37    .22    .10    .15
C        .76    .63     -     .33    .05    .08
D        .85    .78    .67     -     .42    .33
E        .90    .90    .95    .58     -     .39
F        .99    .85    .92    .67    .61     -

NOTE. 70 per cent of judgments preferred A over B, 76 per cent A
over C, etc.

Illustration 158 also shows a marked tendency for the judges to be
fairly consistent, but there are several discrepancies. For instance,
Item F was judged superior to E when both were compared to B,
but Item E was judged to be superior to F by 61 per cent of the judges.
Such discrepancies are indications that the scale does not have high
internal consistency. The judges probably selected different aspects
of a photograph for comparisons on different occasions.

Although the paired-comparisons method is usually considered a
precise way of securing judgments, it is seldom used because simpler
methods seem to be adequate.
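
The change from per cents to scale values can be sketched as follows.
This illustrates only the simplest case, not Thurstone's full
treatment: each proportion in Illus. 158 is changed to a standard
score by the inverse normal curve, the scores in each column are
averaged, and the lowest item is set at zero. Treating the empty
diagonal as an even split and placing the zero at the lowest item
are assumptions of the sketch, and the names PREFER and scale_values
are hypothetical.

```python
from statistics import NormalDist

ITEMS = ["A", "B", "C", "D", "E", "F"]

# PREFER[row][col]: proportion of judges preferring the column item over
# the row item, as read from Illus. 158. None marks the empty diagonal.
PREFER = [
    [None, .30, .24, .15, .10, .01],
    [.70, None, .37, .22, .10, .15],
    [.76, .63, None, .33, .05, .08],
    [.85, .78, .67, None, .42, .33],
    [.90, .90, .95, .58, None, .39],
    [.99, .85, .92, .67, .61, None],
]


def scale_values(prefer):
    """Average the standard scores in each column; shift the lowest to zero."""
    to_z = NormalDist().inv_cdf
    n = len(prefer)
    means = []
    for col in range(n):
        # Assumption: an item compared with itself is an even split (z = 0).
        zs = [0.0 if prefer[row][col] is None else to_z(prefer[row][col])
              for row in range(n)]
        means.append(sum(zs) / n)
    lowest = min(means)
    return [m - lowest for m in means]


values = scale_values(PREFER)
for item, value in zip(ITEMS, values):
    print(f"{item}: {value:.2f}")
```

The averages place the photographs in the order A, B, C, D, E, F, so
the reversal between E and F noted above affects single comparisons
but not the final order.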

Rank-Order Method

When the number of items to be scaled is more than ten and the
number of judges is large, the paired-comparisons method becomes
laborious. The rank-order method overcomes this difficulty without
much loss of effectiveness. It is far easier to rank thirty items than
to make comparisons of 435 pairs. The rank-order method was used
by Cattell (1903) in studying American scientists, and Wells (1908) in
evaluating ten traits of leading American writers. Hollingworth
(1911) and Strong (1911) applied it in studies of advertising appeal
and memory value.
ILLUS. 159

                        Judges
Item No.    A    B    C    D    E    F    G     Md    Centile*
   1        2    1    3    1    2    1    1      1       16
   2        3    2    5    3    1    3    3      3       50
   3        4    4    1    4    4    4    4      4       66
   4        1    3    2    2    3    2    2      2       33
   5        5    6    4    5    5    6    5      5       83
   6        6    5    6    6    6    5    6      6      100

*There is an error introduced here because of the small number of
items used. The centiles are all too high because they represent the
upper limit of the ranks of the items. No simple way of correcting
this error is available, but when 30 or more items are used the error
is very small.

In computing scale values by this method, it is necessary to have
the items ranked by a number of judges or by one judge a number of
times. The median rank for each item is found rather than the mean,
since the median is not influenced as much as the mean by extreme
scores which sometimes are the result of errors. A median rank for
any item is found by arranging all of its ranks in order of size, and
then selecting the middle one. For instance, in Illus. 159 the first
item has the following ranks: 1, 1, 1, 1, 2, 2, 3, and hence a median
rank of 1.
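
The median ranks and centiles of Illus. 159 can be reproduced with a
short sketch. Several digits in the printed table are hard to read,
so the ranks below are partly reconstructed (each judge is assumed to
have used every rank once) and should be treated as illustrative. The
centile rule, 100 times the median rank divided by the number of
items with the fraction dropped, is inferred from the printed values
and is an assumption.

```python
from statistics import median

# RANKS[item]: ranks given by judges A through G (1 = first place).
# Partly reconstructed from Illus. 159; illustrative only.
RANKS = {
    1: [2, 1, 3, 1, 2, 1, 1],
    2: [3, 2, 5, 3, 1, 3, 3],
    3: [4, 4, 1, 4, 4, 4, 4],
    4: [1, 3, 2, 2, 3, 2, 2],
    5: [5, 6, 4, 5, 5, 6, 5],
    6: [6, 5, 6, 6, 6, 5, 6],
}


def median_rank(ranks):
    """Middle rank after the ranks are arranged in order of size."""
    return median(ranks)


def centile(md_rank, n_items):
    """Assumed rule: upper limit of the rank, as a whole per cent."""
    return int(100 * md_rank / n_items)


for item, ranks in RANKS.items():
    md = median_rank(ranks)
    print(item, sorted(ranks), md, centile(md, len(RANKS)))
```

A single extreme rank, such as the 1 given to the third item by one
judge, leaves the median unchanged, which is the reason given above
for preferring it to the mean.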
There are several procedures for changing median ranks into
scale values. One of these, proposed by Hull (1928), assumes that the
items to be ranked have a normal frequency distribution. This
assumption is probably true enough when the items are a large number
of biological phenomena selected at random. If one had one
hundred items to rank, each rank would be a centile, since it shows
the per cent of items that fall below a particular item. When any
number of items are used, the centiles of the items can be changed
into standard scores by means of Illus. 129. The lowest item may be
taken as the arbitrary zero point and the other items assigned scale
values from zero according to standard differences.
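
A minimal sketch of this conversion follows, using the median ranks
of Illus. 159. One detail is an assumption of the sketch and is not
taken from the text: each rank is placed at the midpoint of its
interval, (rank - 0.5) / n, so that the last rank does not fall at
the 100th centile, where the normal curve has no finite standard
score. The name rank_scale_values is hypothetical.

```python
from statistics import NormalDist

# Median ranks from the Md column of Illus. 159 (item number: median rank).
MEDIAN_RANKS = {1: 1, 2: 3, 3: 4, 4: 2, 5: 5, 6: 6}


def rank_scale_values(median_ranks, n_items):
    """Change ranks to standard scores, then measure from the lowest item."""
    to_z = NormalDist().inv_cdf
    # Assumption: midpoint centile (rank - 0.5) / n, not the upper limit.
    zs = {item: to_z((rank - 0.5) / n_items)
          for item, rank in median_ranks.items()}
    lowest = min(zs.values())
    return {item: z - lowest for item, z in zs.items()}


scale = rank_scale_values(MEDIAN_RANKS, n_items=6)
for item in sorted(scale, key=scale.get):
    print(item, round(scale[item], 2))
```

The steps between neighboring ranks are wider at the ends than in the
middle, which is the effect of assuming a normal distribution of the
items.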

Two other important methods for scaling ranked items do not
assume that the items are normally distributed, but that they may have
any form of distribution. In one of these Guilford (1936) calculated
the per cent of judgments which place an item above a composite
standard. These per cents were changed to standard deviations or
standard scores as before. In the other Thurstone (1931) calculated
the proportion of judgments which place each item above or below
every other item. This method converts the ranks into paired
comparisons and follows the procedure for that method.

A variation of the rank-order method, called the method of
equal-appearing intervals, is useful when a large number of items is
to be ranked. Instead of ranking all in order, the items are placed in
piles which appear to be equally spaced. Illustration 160 illustrates
the procedure used where there were five items arranged in five piles
by twenty-four judges.

ILLUS. 160. SCALING BY THE METHOD OF EQUAL-APPEARING
INTERVALS

                 Piles              No. of     Median
Item     1    2    3    4    5      Judges     Values      Q
A       18    4    2    -    -        24        1.16      .33
B        4   10    8    2    -        24        2.30      .52
G        2    5    9    6    2        24        3.40      .72
N        -    -    6   10    8        24        4.18      .62
T        -    -    -    4   20        24        4.90      .27

Note. Item A was placed in the first pile by 18 judges, in the second
pile by 4 judges, and in the third pile by 2 judges, etc.

More items were used but only 5 are shown since they are enough to
illustrate the method. The median value of each item is preferred to
the mean, for the items on the end piles will not be normally
distributed. The Q of each item shows the range of the middle 50 per
cent of judgments. In Illus. 160 the ranges are greater for items
which appear in the middle piles most frequently. This situation is
commonly found, and it is due in part to the fact that the scale does
not extend far enough to give normal distribution to the highest and
lowest items. The Q of some items is thus reduced by their extreme
position. The Q is also an indication of the inability of judges to
agree upon the relative position of an item. There may be
disagreement owing to random errors in discrimination or to different
standards of excellence. For example, a large Q may be an indication
of chance errors or of a lack of internal consistency in the scale.
In either case the items with the smaller Q's are preferred for scale
construction.
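
The median value and Q of an item can be computed from its pile
counts as in the sketch below. Two conventions are assumptions of the
sketch: each pile k is taken to extend from k - 0.5 to k + 0.5, and Q
is taken as half the distance between the first and third quartiles.
These conventions reproduce the values printed for Item A (1.16 and
.33) and the median of Item T; some other printed entries differ a
little from what they give. The function names are hypothetical.

```python
def grouped_quantile(counts, fraction):
    """Interpolated scale point below which `fraction` of judgments fall.

    Pile k (numbered from 1) is assumed to span k - 0.5 to k + 0.5.
    """
    total = sum(counts)
    target = fraction * total
    cumulative = 0
    for pile, count in enumerate(counts, start=1):
        if count and cumulative + count >= target:
            return (pile - 0.5) + (target - cumulative) / count
        cumulative += count
    raise ValueError("no judgments recorded")


def median_and_q(counts):
    """Median value and Q (assumed: half the interquartile range)."""
    md = grouped_quantile(counts, 0.50)
    q = (grouped_quantile(counts, 0.75) - grouped_quantile(counts, 0.25)) / 2
    return md, q


md_a, q_a = median_and_q([18, 4, 2, 0, 0])   # Item A of Illus. 160
md_t, q_t = median_and_q([0, 0, 0, 4, 20])   # Item T of Illus. 160
print(f"Item A: median {md_a:.2f}, Q {q_a:.2f}")
print(f"Item T: median {md_t:.2f}, Q {q_t:.2f}")
```

An item whose judgments crowd into an end pile cannot spread beyond
the end of the scale, which is why its Q comes out small.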

Sanford (1908) described the method of equal-appearing intervals
in scaling weights of envelopes. Thorndike (1910) had forty judges
sort one thousand samples of penmanship into eleven piles.
Hollingworth (1911) had thirty-nine jokes classified in ten degrees
of humor. Hillegas (1912) had judges classify English compositions
into classes. Thurstone (1928) applied this method to establish
attitude scales toward a number of political or social issues, such
as the established church (Illus. 15). Approximately three hundred
statements which voiced approval, indifference, or disapproval of the
issue were sorted into nine piles. The median value and the Q were
found for each item and from these an absolute-scale score was
derived.

Finally, forty-five statements were chosen to represent nearly equal
steps of attitude ranging from very favorable to very unfavorable.
Statements with the smallest Q's were preferred and two checks of
validity, not described here, were applied. The care used by
Thurstone and his students in the construction of approximately
thirty different scales has made them among the most highly valued.

Remmers and his students (1934) showed that individuals made
nearly the same scores when a group of items were presented either
in a haphazard order or in the order of their scale values. The latter
allowed a much more rapid scoring than the former. They also
pointed out that a great economy in attitude-scale construction could
be effected if a general scale of opinion toward a class of social
phenomena were devised. They forthwith published six generalized
scales of attitudes toward:

1. An institution, such as war or Sunday observance

2. A race, such as the Chinese or the Negro

3. A homemaking activity, such as child care or preparation of a meal

4. A moral or social practice, such as petting or drinking

5. An occupation

6. A school subject

These generalized scales were composed by following the procedure
devised by Thurstone. The correlation between the generalized Form
A (Illus. 161) and Thurstone's Scale of Attitude toward Communism
was .816, and between the generalized Form A and A Scale of Sunday
Observance, .83. The correlation between the generalized Form B
and Thurstone's Scale of Attitudes toward the Negro was .669, and
with attitudes toward the Chinese, .72. These figures doubtless
indicate similar attitudes in both scales, although the generalized
forms usually contain a few items that are not well adapted to some
of the situations in which they may be used.


ILLUS. 161. ATTITUDE TOWARD AN INSTITUTION

Scale     Q       Item
Value   Value     No.    Item
11.1     0.8       2     It is the most admirable of institutions.
10.2     1.6       9     It is a strong influence for right living.
 9.1     2.5      17     It is necessary to society as organized.
 8.2     2.0      20     It does more good than harm.
 7.4     3.2      21     It will not harm anybody.
 6.1     2.6      23     It is necessary only until a better can be found.
 4.9     1.9      26     It does not consider individual differences.
 4.4     2.1      29     It represents outgrown beliefs.
 3.3     1.7      32     It is too selfish to benefit society.
 2.8     1.9      36     It is hopelessly out of date.
 1.9     1.3      42     It will destroy civilization if it is not
                         radically changed.
 1.7     1.0      44     It benefits no one.

(Remmers et al., 1934, Samples from Form A, p. 19. By permission of the
Editor, Journal of Social Psychology.)


The generalized forms have also been criticized by Stagner and
Drought (1935) because they found that scale values changed
considerably when the same form was used for different items.

Another method of scaling, which avoids the neutral items of the
scales just described and yields a reliable score with a minimum of
labor, is that described by Likert (1932). He presented a group of
persons with a large number of items designed to measure attitude
toward a particular institution, such as attitude toward labor unions.
For this purpose five categories were generally used: strongly agree,
agree, undecided, disagree, and strongly disagree. Weights were
assigned in accordance with the assumption of normality of the things
being rated and the proportion of the total group making each
choice. For most items he found that only small errors were introduced
by using the weights 1, 2, 3, 4, and 5. Scores for each subject were
obtained by adding the weights of his item responses. In order to
select the most discriminating items, the group was divided into
those with high scores and those with low scores, and the percentages
of each subgroup marking each response to each item were computed.
The most discriminating items were then selected for inclusion in his
scale.
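
The scoring and item selection just described can be sketched as
follows. The responses are hypothetical, and the sketch simplifies
the comparison of subgroups: instead of tabulating the percentage of
each subgroup marking each response, it takes the difference between
the mean weights of the high and low halves on each item. That
shortcut is an assumption of the sketch, not the procedure reported
in the text.

```python
from statistics import mean

# Hypothetical data: one row per person, one column per item,
# weights 1 (strongly disagree) to 5 (strongly agree).
RESPONSES = [
    [5, 4, 3, 5],
    [4, 5, 3, 4],
    [4, 4, 2, 5],
    [2, 1, 3, 2],
    [1, 2, 4, 1],
    [2, 2, 3, 1],
]


def total_scores(responses):
    """A person's score is the sum of the weights of his responses."""
    return [sum(row) for row in responses]


def item_discrimination(responses):
    """Mean weight in the high-scoring half minus that in the low half."""
    order = sorted(range(len(responses)), key=lambda i: sum(responses[i]))
    half = len(order) // 2
    low = [responses[i] for i in order[:half]]
    high = [responses[i] for i in order[-half:]]
    n_items = len(responses[0])
    return [mean(r[k] for r in high) - mean(r[k] for r in low)
            for k in range(n_items)]


totals = total_scores(RESPONSES)
discrimination = item_discrimination(RESPONSES)
print(totals)
print([round(d, 2) for d in discrimination])
```

In these made-up data the third item separates the two halves in the
wrong direction and would be dropped, while the fourth separates them
best.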

Rating-Scale Methods

Rating methods use scales which have been established by one of
the methods just described, or use arbitrary scales.

The rater is asked to indicate on the scale the position of each
person or item to be rated. This procedure is less time consuming
than ranking items and is more interesting to many judges than the
other procedures. It is widely used for evaluating beliefs,
preferences, personal traits, and industrial efficiency.

1. Forms of Rating Scales. The final form in which a rating scale
is cast is important because it determines to some extent the accuracy
and the speed with which a rating can be made. Four common forms
will be described.

a. Classified form. In this form the rater is asked to indicate his
judgment by marking the name of a class. A common classification
is: excellent, good, fair, and poor, and initials or numbers may be
used to designate these headings. (See Illus. 162.)

ILLUS. 162. RATING SCALE FOR ITEMS CONCERNING THE
VALUE OF MUSIC

As a first step in making this scale we want a number of persons to
rate these statements by assigning them to nine different classes. We
will call these classes A, B, C, D, E, F, G, H, and I. If you find a
statement which you believe expresses the highest appreciation of the
value of music, underline the letter A. For a statement which seems
neutral or non-committal, underline E (the middle letter), while for
those statements which express the strongest depreciation of music,
underline I. Other degrees of appreciation or depreciation may be
indicated by underlining one of the intermediate letters.

A B C D E F G H I   23. Music stimulates and encourages me in my
                        life work.
A B C D E F G H I   24. To me music is of no greater or lesser
                        importance than any other of the arts and
                        sciences.

(From Seashore and Hevner, 1933, p. 369. By permission of the Editor,
Journal of Social Psychology.)

Another classification uses statistical terms: above average,
average, and below average. Sometimes a rater is asked to classify
persons in fourths or fifths or thirds of a group.

In rating one's preferences the words like, indifferent, and dislike
are often used. Sometimes one simply classifies the item using one of
two choices: present or absent, yes or no, like or dislike, agree or
disagree.

Classifications are often combined with other rating methods. For
purposes of summary and comparison, classified ratings are usually
given numerical scores.

b. Descriptive form. In this form each step in the scale is
designated by a word or phrase which describes, sometimes elaborately,
particular behavior patterns. (See Illus. 163.)

c. Graphic form. In this form a straight line is provided, often with
numbered spaces, as in Illus. 163, and the rating is made by placing
a mark on the line to designate one's judgment. The graphic form
is rarely used by itself, but is usually combined with a classified or
descriptive form.

ILLUS. 163. DISTRIBUTION OF RATINGS OF NURSES, UNIVERSITY OF
MICHIGAN SCHOOL OF NURSING

Rating of Performance in Service

Item #5. Adjustment to situations

Step 1: Sometimes at a loss in familiar situations
Step 2: Slow to adapt to new situations
Step 3: Learns new arrangements fairly soon
Step 4: Quick to adjust to new routine
Step 5: Very quick to respond to emergencies

                       Step
Total           1    2    3    4    5      Md.      Q
1st Rater      10   20   15    3    2     2.27     .67
2nd Rater       -    9   32    9    -     3.01     .37

Note. When the medians and quartile ranges were calculated, each space
was allotted one point on a scale from 1 to 5. The lowest value was
assigned to the space at the left, and the highest to that at the
right. When the values were interpolated, the whole points were
assumed to be located in the middle of the space.

d. Man-to-Man form. This form (Illus. 164) was designed to
make comparisons more definite by placing in blanks on a key
sheet the names of men known to the rater to represent standard
levels of ability. Each person to be rated was compared to this
standard and assigned a numerical score. In 1919 this man-to-man
scale was used for rating a large number of United States Army
officers.

ILLUS. 164. UNITED STATES ARMY RATING SCALE

I. PHYSICAL QUALITIES
   Physique, bearing, neatness, voice, energy, endurance. Consider
   how he impresses his command in these respects.
      Highest 15   High 12   Middle 9   Low 6   Lowest 3

II. INTELLIGENCE
   Accuracy, ease in learning, ability to grasp quickly the point of
   view of commanding officer, to issue clear and intelligent orders,
   to estimate a new situation, and to arrive at a sensible decision
   in a crisis.
      Highest 15   High 12   Middle 9   Low 6   Lowest 3

III. LEADERSHIP
   Initiative, force, self-reliance, decisiveness, tact, ability to
   inspire men and to command their obedience, loyalty, and
   co-operation.
      Highest 15   High 12   Middle 9   Low 6   Lowest 3

IV. PERSONAL QUALITIES
   Industry, dependability, loyalty, readiness to shoulder
   responsibility for his own acts, freedom from conceit and
   selfishness, readiness and ability to co-operate.
      Highest 15   High 12   Middle 9   Low 6   Lowest 3

V. VALUE TO THE SERVICE
   Professional knowledge, skill and experience, success as
   administrator and instructor, ability to get results.
      Highest 15   High 12   Middle 9   Low 6   Lowest 3

(Note from p. 259 of The Personnel Manual.) It will be noted that it
is really five separate scales, one each for each of the five
essential qualities of an officer, namely, physical qualities,
intelligence, leadership, personal qualities, and general value to
the service. Each of the spaces is to be filled with the name of an
officer who is taken as a standard for the qualification and the
degree of the qualification indicated by the terms "highest," "high,"
"middle," "low," and "lowest."

Each of the officers is ordinarily of the same rank as the rater and
hence the rank next superior to that of the officer to be rated. Each
of them is well known to the rater and stands in his mind as an
exemplar of the qualification. With each of them he compares the
officer to be rated on a man-to-man basis to find which one he most
nearly equals in that qualification. The officer to be rated is
compared with officers of superior rank because the object is to
discover his fitness for promotion.

The accuracy of the result depends largely upon the care with which
the rating scale is constructed. When instructions are followed
closely and raters do their work conscientiously the ratings show a
high degree of accuracy and uniformity.

(The Personnel Manual, Committee on Classification of Personnel,
Adjutant General's Dept., 1919, p. 260. By permission of the Govt.
Print. Office, Washington, D.C.)

Most of the rating methods that have been described have serious
shortcomings, such as the following:

a. They use general character traits or descriptive terms, such as
leadership, poise, and work attitudes, which are difficult to define.
It is almost impossible for all raters to give the same meaning to such
a term as leadership, but it is possible for them to agree on what
the ratee did with regard to planning, handling grievances, training
his workers, or other activities involved in leadership.

b. Rating methods employ adjective or point scales with such
poorly defined steps that most raters can seldom distinguish between
them. For example, the difference between average and above average
is almost always debatable, because one rater's average does not
correspond to another's.

c. Rating methods combine two difficult psychological processes
in one judgment: (1) the observation and recording of performance
and (2) the evaluation of performance. This combination usually
results in inaccurate observation and evaluation.

d. In many military and industrial situations practically every
one is rated above average, and those who are really doing excellent
work are not given much more credit than mediocre workers. This
leniency on the part of raters is due to a desire to have friendly
relations with those supervised, and to give them as good ratings as
other supervisors in the organization.

In order to avoid overgenerous rating based on fragmentary
evidence and expressed in vague terms, a rating technique must (a)
yield specific evidence of past performance, (b) separate the
observation procedure from the evaluation, and (c) base the evaluation
upon well-validated factors in job success.

The Forced-Choice Method 

To meet these specifications the United States Army Personnel
Research Section developed the forced-choice technique (Sisson,
1948). In the forced-choice rating the rater is furnished from twenty
to thirty small groups of specific descriptive terms and required to
indicate which items in each group are most typical and which least
typical of the ratee. Each group of items usually contains two
favorable items (a and b), two unfavorable items (c and d), and one
neutral item (e): (a) commands respect by his actions, (b) cool-headed,
(c) indifferent, (d) overbearing, and (e) quiet.

In order to separate the process of observation from that of
evaluation, six hundred items were all pretested in such a way as to
yield two indices for each item, one for discriminative value and the
other for preference value. All the items were applied in experimental
form to two groups of officers, which had been carefully selected to
represent the most and the least competent. The discriminative value
of an item was the difference between the percentages of officers
described by the item in the most competent and the least competent
group. The preference value was secured by having a large number
of officers rate each item for apparent degree of praise or blame.
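
The two indices can be illustrated with a short sketch. All of the
figures are hypothetical, as are the item labels and function names.
Taking the preference value as the mean of the praise-or-blame
ratings is an assumption of the sketch; the text does not say how the
ratings were combined.

```python
from statistics import mean

# Hypothetical pretest results for two favorable items.
# pct_most / pct_least: per cent of officers in the most and the least
# competent group who were described by the item.
# praise_ratings: hypothetical ratings of apparent praise (1 to 5).
PRETEST = {
    "item_a": {"pct_most": 72.0, "pct_least": 30.0,
               "praise_ratings": [4, 5, 4, 4]},
    "item_b": {"pct_most": 55.0, "pct_least": 50.0,
               "praise_ratings": [4, 4, 5, 4]},
}


def discriminative_value(pct_most, pct_least):
    """Difference between the percentages in the two criterion groups."""
    return pct_most - pct_least


def preference_value(praise_ratings):
    """Assumed here to be the mean rating of apparent praise or blame."""
    return mean(praise_ratings)


for name, row in PRETEST.items():
    d = discriminative_value(row["pct_most"], row["pct_least"])
    p = preference_value(row["praise_ratings"])
    print(f"{name}: discriminative {d:.1f}, preference {p:.2f}")
```

In this made-up case the two items look equally flattering to a
rater, but only the first separates competent from incompetent
officers, which is the kind of pair the method places in one group.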

In each group of items the two favorable items have nearly equal
preference values but different discriminative values, and the same
is true of the two unfavorable items. In scoring these forms, only the
discriminative items are counted. Since these are not known to the
rater, he cannot know if he is rating the person relatively high or
low, and therefore cannot play favorites. The results on large numbers
of officers showed that the forced-choice ratings were not as skewed
toward the desirable end of the scale as were scale ratings of the
same officers, and that the scoring keys could not be guessed or
detected by any means usually available to the raters. The
forced-choice method is promising because it requires reporting of
observations, and eliminates rater judgments concerning the over-all
evaluation of performance. The combining of two pairs of items in one
group is a device used to overcome rater resistance to unfavorable
items. A great deal of careful research will be needed in applying this
method, because the discriminative values of items will doubtless
change as the criteria of success are changed or improved.

Nominating Techniques 

Another rating procedure that has been used in only a few military
situations but in many school situations asks the raters to nominate
or name persons in the group for particular roles. Thus Wherry and
Fryer (1949) found that "buddy nominations" of five out of a
section of twenty men, who possessed the most desirable traits for an
army officer, showed much greater retest reliability over a 3-month
period than ratings of ten leadership qualities on an adjective scale.
These results are probably due in part to the differences in
procedure. It is easier to select the five top people in a group than
to rate or rank members of the whole group. Also it is easier to
nominate a specific person than to rate him with regard to abstract
and hard-to-define qualities.

CONSIDERATIONS IN RATINGS 
Relative Intangibility of Items 

Regardless of the form used for rating, some items appear to be
more difficult to discriminate than others. With regard to physical
judgments the finer discriminations are clearly the most difficult to
make. The results of differentiating between personal traits
that are complex, that have been ambiguously defined, and that are
rarely observed, vary greatly. This fact is clearly shown in Illus.
165, which shows the ratings for one boy on ten traits by from ten to
fifteen observers. The observers were consistent with one another in
rating resistance to authority, self-assertion, social responsibility,
and popularity. Less uniformity is indicated by the longer lines for
the ratings of poise and leadership. Great lack of agreement is shown
in ratings of interest in the opposite sex, awareness of audience,
appearance, and grooming activity. The last two items, which one
would think could be judged uniformly, showed more variations than
such intangible items as leadership and poise. These results should
make one realize the need of a check on the inconsistencies of raters
in any situation.

ILLUS. 165. CONSISTENCY OF RATINGS OF DIFFERENT TRAITS
BY VARIOUS OBSERVERS

[Chart: one horizontal line for each of ten traits: Poise,
Popularity, Leadership, Resistance to Authority, Self-assertion,
Interest in Opposite Sex, Social Responsibility, Awareness of
Audience, Appearance, Grooming Activity.]

Highest and lowest ratings are indicated
by the extremes of each horizontal line.

(Drawn from data in the files of the Institute of Child Welfare,
University of California. Courtesy of H. E. Jones.)

Number of Desirable Steps 

The number of steps in rating scales ranges from two to a hundred
or more. Four or five steps are most frequently used in ratings of
attitudes and traits in fifty-two samples which have been collected
by the writer. In the self-rating of academic or vocational
preferences three or four steps or classes are most common. The most
satisfactory number of steps to use is that number which can be
clearly distinguished by the judges in a reasonable time, without
distorting the results. This may be determined by trying out scales
with various numbers of steps. Symonds (1924, 1931) found from
empirical evidence that a 7-point descriptive scale was a little more
reliable than a 5-point scale when the judges were interested and
definite traits, for example, neatness, were clearly defined. When a
trait, for example, tact, was vague, or when the judges were immature
or not interested, only 4 or 5 points were clearly distinguished.

Champney and Marshall (1939) compared two procedures in scoring
a 7-centimeter graphic scale of home conditions. In one procedure
a 7-point scale was used, one point for each centimeter. In the
other, a 70-point scale was used, one point for each millimeter. Both
procedures were applied to two alternative forms applied within a
3-week interval. The two forms yielded retest correlations of .603 for
the 7-point scale and .766 for the 70-point scale. This increase in
retest reliability is more than could be expected by the usual methods
of correcting for coarse scaling. It indicates that the judgments were
more consistently recorded on the finer scale.

Another way of evaluating the number of steps to be used is to
find out whether or not all the judges use all the steps or cluster
the cases in one or two steps. Judges have been found to differ a good
deal in this respect. Thus Illus. 163 shows the distribution of
ratings assigned to the same fifty nurses by two supervisors. The
results show that the second supervisor used only three of the steps
while the first used all five. If some steps in a scale are never, or
almost never, used, they probably have little effect on the scores.
Occasionally, however, a lowest step which is never used causes the
step next to it to be used more frequently than might otherwise be
the case.

Halos and Logical Eirors 

Ratings on a number of tiaits by one person aie subject to two 
sources of error w’hich arc sometimes rather important One source 
is called a //a/o efjeeij the other a logical eiior A halo effect is a 
tendency to classify a peison on the whole as good oi bad and then 
to rate him on all traits nr keeping with this opinion \ logical error 
is introduced by a s]oecial inteiprei.iiion oi the task. 

Halo effecis and logical enois, of coin sc, defeat one of the mam 
purposes of latiiig scales, namely, to find out a person's lelative 
strength and weakness Ihey can sometimes be a\oicIcd by lariiig all 
persons in a group on one trait at a time, instead ol lating one person 
on all traits belore considering the next j^eisoii Another device lor 
avoiding halo elLccts in latiiigs is an irregular aiiaiigement of the 
steps or classes A common arrangement of items is to have all 
the most desnahlc chai acteristics pl.iced on the iight-hand side oE the 
page, the least desirable on the lelt, and the average traits in the 
middle This ariangcment may lead to routinely placing checkmarks 
down the page by one who is too busy to use the rating caielully If 
the most desnahlc tiaiis are placed soinctiincs on the light and some- 
times on the lelt, the rr^tei is lequired lo read the items moic care- 
fully. Mathews (1927) lound that there is a consideiablc lendenry 
in multiple-choice tests to check the responses printed near the lelt, 
rather than those punted near the right, when choices aie printed 
across a page \\'hen choices aie set in columns, Lhcic is a marked 
tendency to check fhose nearer the top more fiequently than those 
nearer the bottom ol a seiics 
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Halos are systematic errors which can be exposed by correlation 
techniques. An accurate rater will secure scores which correlate well 
with a true criterion, and he will also be able to repeat his perform- 
ance with a high self-consistency. A rater who depends upon chance 
will usually show a low correlation with the criterion and also a low 
self-consistency. A rater who is markedly biased will have a low cor- 
relation with the criterion but a high self-correlation. Adams (1930) 
has shown that peisons often differ a great deal in their ability to 
estimate size of circles, lengths of lines, and also personal qualities of 
their friends. He found that a good judge of self tends to differ from 
a good judge of others, by being rated by his colleagues as happier, 
more sympathetic, generous, and courageous. 
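
The three kinds of rater just described can be told apart from two
correlations: that of the ratings with the criterion, and that of the
ratings with their own repetition. The following is a minimal sketch in
Python. The cutoff of .70 for a "high" correlation and the function
names are assumptions made for illustration; they do not come from the
studies cited.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Product-moment correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def classify_rater(first, second, criterion, cutoff=0.70):
    """Label a rater from two sets of ratings and a criterion.

    cutoff is a hypothetical dividing line between 'high' and 'low'.
    """
    validity = pearson(first, criterion)      # agreement with the criterion
    consistency = pearson(first, second)      # agreement with himself
    if consistency >= cutoff:
        return "accurate" if validity >= cutoff else "biased"
    # Low self-consistency: the text describes only the case in which
    # the correlation with the criterion is low as well.
    return "chance" if validity < cutoff else "unclassified"
```

A rater who repeats the same wrong order on both occasions is labeled
"biased" by this sketch, since his self-correlation is high while his
correlation with the criterion is low.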

In relationships among personal traits, halo effects introduce errors 
which raise or lower correlations spuriously. When correlations
between two dissimilar traits are as high as .90, halo effects may be
suspected. The true relationships may be approximated by pooling
the ratings of several persons on the assumption that personal
prejudices will tend to cancel one another. This assumption brings up
the interesting question, how many persons' ratings should be pooled
to eliminate halo effects? Bradshaw (1930) found that the rating of a 
personality trait of students in college attained a correlation of .80 
between two trials when at least five ratings were averaged. A reliabil- 
ity coefficient of .90 would be reached theoretically when the ratings 
by ten such persons were averaged. He reported many variations from 
these usual findings which depended upon the trait, the accuracy of 
the descriptions, and the raters.
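
The step from .80 with five raters to a theoretical .90 with ten can be
checked with the Spearman-Brown formula. The sketch below assumes that
this formula lies behind the theoretical figure, which the text does
not state, and that all raters are equally good; the function names are
illustrative.

```python
def pooled_reliability(r_single, k):
    """Spearman-Brown: reliability of the average of k comparable ratings."""
    return k * r_single / (1 + (k - 1) * r_single)

def single_from_pooled(r_pooled, k):
    """Inverse of the formula: reliability of one rating implied by a pooled value."""
    return r_pooled / (k - (k - 1) * r_pooled)

r1 = single_from_pooled(0.80, 5)    # about .44 for a single rater
r10 = pooled_reliability(r1, 10)    # about .89, close to the .90 cited
```

Under these assumptions a single rating has a reliability of only about
.44, which shows how much of the final figure is owed to pooling.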

A logical error is one in which traits are rated alike because the 
rater has some reason for believing that they are similar or occur to- 
gether. He substitutes logic for the direct observation of behavior. 
Newcomb (1931) found that ratings on a large number of personal- 
ity traits in boys correlated on the average much higher (.493) than 
records of observed behavior of the same boys in the same traits 
(.141). He believed that the raters were not particularly prejudiced,
but had adopted various stereotypes of personality which influenced 
their ratings. 


MERITS OF VARIOUS METHODS

Each of the above methods of estimating has advantages. The
self-inventory method, described more fully in Chapter XXII, is widely
used in schools, clinics, and industry, when the subject is able and 
willing to cooperate. It seems to give fairly reliable indications of 
interests, of attitudes toward policies and social institutions, and 
of useful methods of overcoming obstacles. Inventories are easy to
score with multiple-choice methods. However, they cannot be used
with children or with adults who cannot or will not cooperate. They
can, with few exceptions, be filled out so as to conceal the truth to a
considerable extent.

The paired-comparisons method is usually tedious and time-consuming
if many items are to be compared. It does seem to yield more
stable results than other methods, however, and some ingenious forms 
will minimize the labor involved. 

The rank-order method is not so time-consuming as making paired 
comparisons, and it seems to avoid halo effects to some degree. 

Rating scales are used both in schools and in industries more than
any other type of appraisal. They are the least reliable and most
subject to halo, but are much more economical of time, labor, and
materials. Because in the forced-choice method the rater does not
know what scores will be given to his ratings, it is a distinct
improvement over the other methods.

The relative effectiveness of various methods of comparison has
rarely been studied, but Hevner (1930) scaled samples of handwriting
by three methods: paired comparison, ranking, and equal-appearing
intervals. She found that the scale values of the first two methods were
similar to each other but different from the values of the third. The
third gave less precise discrimination among the better samples of
handwriting. Ferguson (1939) reported, however, close agreement
among scale values according to the three methods, and concluded
that the equal-appearing interval method was superior in both
economy and accuracy. In this method the scale value of an item was
not greatly affected by the inclusion or exclusion of other items.
Aesthetic judgment, according to Bullough (1908), is disturbed by
paired comparisons. Conklin and Sutherland (1923) found that
consistency in judging the humor of jokes was less when a ranking method
was used than when a rating scale was employed.

Wherry and Fryer (1949) reported repeat-test reliability on the
following three types of ratings, as measured by correlations of ratings
repeated after one month and again after three months for a sample
of eighty-two officer candidates:

a. Buddy nominations were made by having each cadet nominate
the five men in his section who possessed the least desirable
personality traits for an Army officer and the five men who possessed the most
desirable traits. The score was the times mentioned as “most
desirable” minus times mentioned as “least desirable,” divided by the
total possible mentions, that is, an average of nominations by all the
men in a section (approximately 20).

b. Buddy ratings were averages of ratings made by all students in
a section, using a graphic adjective scale of ten leadership qualities.

c. Eight Junior Tactical Officers rated all the students in the class
known to them, using the same scale as in (b). Three judges were
usually available for each cadet.
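
The scoring rule for buddy nominations in (a) can be sketched in
Python as follows. The sketch assumes that "total possible mentions"
means one possible mention from every other man in the section; the
function name and the data layout are hypothetical, and every name on a
ballot is assumed to belong to a member of the section.

```python
def nomination_scores(ballots):
    """ballots maps each nominator to (most_desirable, least_desirable) name lists.

    Returns, for every member, (times named most desirable minus times
    named least desirable) divided by the number of possible mentions.
    """
    members = list(ballots)
    possible = len(members) - 1      # assumption: one mention per other man
    totals = {name: 0 for name in members}
    for nominator, (most, least) in ballots.items():
        for name in most:
            if name != nominator:    # self-nominations are ignored
                totals[name] += 1
        for name in least:
            if name != nominator:
                totals[name] -= 1
    return {name: total / possible for name, total in totals.items()}

# A toy section of four men, each naming one "most" and one "least" desirable.
ballots = {
    "A": (["B"], ["D"]),
    "B": (["A"], ["D"]),
    "C": (["B"], ["A"]),
    "D": (["B"], ["C"]),
}
scores = nomination_scores(ballots)
```

Scores under this rule run from -1.00, for a man named least desirable
by every colleague, to +1.00, for a man named most desirable by all.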

Buddy nominations had correlations of .75 for the one-month
interval, and .58 for the 3-month. Buddy ratings of leadership
correlated .76 for the one-month interval and only .17 for the 3-month.
The ratings by superior officers showed correlations of .58 for the
one-month interval and .28 for the 3-month. Some fairly clear results
emerge. The longer the interval between the ratings, the smaller the 
correlation. The buddy nominations were much more stable over the 
3-month period than the buddy ratings, which means that the rating 
procedure using ten leadership qualities introduced more variability 
among the raters than the nominating procedure. Thus the cadets
still considered many of the same persons to be the five best or five 
worst as leaders, after 3 months, while they failed markedly to rate 
all candidates consistently, using a group of verbally defined traits. 
The practical significance of this finding is great, for the nominating 
technique is also easier to administer, score, and interpret. 

The superior tactical officers' ratings had lower repeat-test
reliability than the buddy nominations for both intervals. This is probably
due to the smaller number of judges, as well as the smaller degree of
acquaintance of judges with candidates.

The superior tactical officers' judgments correlated with buddy
nominations .36 after one month of training, .45 after 2 months, and
.53 after 4 months, which means that the Tactical Officers evaluated
leadership at the end of 4 months about as well as the cadets did at
the end of one month.

The United States Army Personnel Research Section Report No. 
672 (1945) compared the validity of five efficiency reporting methods, 
which included two adjective rating forms, each with about ten traits
to be rated, a forced ranking form, a check list, and a forced-choice 
form. These were validated against the ratings of carefully selected 
groups of officers. The forced-choice method showed clear superiority 
over the other methods on several large groups of officers. Predictions 
could be increased slightly by combining the forced ranking and the 
check list with the forced choice. 

CONSTRUCTION OF ITEMS 

The same considerations which were presented for the construction 
of test items (Chapter IV) apply also for rating items, particularly 
self-rating inventories. These rules are designed to eliminate
ambiguities and to supply important information more quickly and
accurately than is possible by other means. One should use simple
language and positive statements. The character-sketch items (Illus.
228) seem to yield more accuracy than many shorter items which
permit a variety of interpretation, but both kinds are useful.

The Wording of Items 

Benton (1935) applied the Personal Inquiry Form of Landis and 
Zubin to 20 normal adults and 20 psychotic patients, and then asked 
each to explain his answers. From 44 questions he obtained
approximately one hundred interpretations. Of the forty-four questions 20
displayed qualitative differences, 17 showed quantitative differences,
and 7 had no differences. In the interpretation of “Have you ever
felt that life is a dream?” a qualitative difference is seen. Some thought
that this meant that life is very unreal — like a dream; others that 
life is very easy and beautiful. A quantitative difference in interpreta- 
tion involved ambiguity in the amount of a trait or quality. 

The various interpretations were cast into another form of one
hundred items called the Interpretation Questionnaire. Both the
Personal Inquiry Form and the Interpretation Questionnaire were
applied twice to 90 normal and 100 psychotic individuals at intervals
of from 3 to 21 days. Answers on both were tabulated to show
differences in the amounts of variation in interpretation. The differences
were most apparent among items which had been changed to reduce
the quantitative variations. Thus, the original item, “Do you feel
mentally inferior to your friends?” was changed to, “Do you feel
mentally inferior to most of your friends?” The original form is
quantitatively more vague than the revised form. The original item
did not differentiate between normal and psychotic groups, but the
revised item brought out large differences. This study shows that the
usefulness of items is greatly increased by changes which reduce the
number of possible interpretations. The revised items proved to be
more reliable on a retest and more discriminative of psychotic
tendencies than the original items from which they were derived. Nearly
every item available today can be considerably clarified.


EMOTIONAL REACTIONS TO ITEMS

When using any rating some persons will be on the defensive, and
some will have the desire to show off. Many investigators have
emphasized the superiority of somewhat indirect measures of motives and
adjustment tendencies. Maller (1932) reported that although
approximately 70 per cent of pupils felt somewhat embarrassed by
direct questions about their own shortcomings, only 43 per cent were 
irritated by impersonal questions. He therefore preferred the char- 
acter-sketch type of test in which one identified oneself with a pen 
portrait. Rundquist and Sletto (1936) preferred to state their items in 
the third person rather than in the first. 

Another indication of emotional response is seen in responses to
positive and negative items. Lorge (1935) reported that some people 
showed a tendency to check either positively or negatively any kind 
of question on the Bernreuter Scale. This tendency was greater in 
some persons than their consistency in checking one item yes, and its 
opposite no.

R. B. Smith (1932) found that positive and negative items which
were designed to be opposites of each other did not call out opposite
responses. For example, “Feels he has failed at most everything he
tried” was not statistically opposite “Feels he has succeeded in most
everything he tried.” Smith believed that the two items called out
different emotions.

Rundquist and Sletto (1936) found that their negative or gloomy
statements were more discriminative of morale than their positive or 
happy statements. The positive statements tended to be answered 
more uniformly than the negative ones, and also to have smaller 
correlations with total scores in all except the scale for family
adjustments.

These studies indicate clearly that reactions to specific items vary 
in ways which can be measured, and suggest the possibility of ap- 
praising emotional behavior by comparing responses to various items. 

Is a rating more truthful when the raters are anonymous? This is a
question that is often raised. Spencer (1938) found that students in a
fairly large group admitted that they would have answered some
questions untruthfully had they been required to sign their names.
These students were those who showed the greatest mental conflict
scores; hence Spencer concludes that signing one's name materially
reduces the effective selection of those who need counseling help.
Olson (1936) and Moore (1937) also found that frankness increases 
with the degree of anonymity of the report. 

According to Johnson (1934) administration should take
cognizance of the mood of the subject. He reported marked differences in
dominance scores of subjects when they felt cheerful and when they 
felt depressed. 

Hartshorne and May (1929) found a large increase in names given 
for cooperative children by classmates when the raters' names were 
placed in the ballots, as compared with anonymous ballots, but not 
much change in names of noncooperative children. When raters'
names were required, children centered their votes on positive votes
and fewer names, and also voted less often for themselves.

A distinct departure from the conventional rating schedule in 
general use is the Employee Guidance Sheet developed by the
Alabama State Personnel Department and reported by I. S. Smith (1944).
The five descriptive phrases accompanying each of ten traits to be 
rated are couched in language intended to help to encourage the 
employee rather than to report findings in a coldly impersonal and 
blunt manner. The following illustrates the effort that was made to 
humanize the report and stimulate the employee to greater effort: 

Usual Form 

Quantity of work:

( ) unusually high output
( ) high output
( ) normal output
( ) limited output
( ) insufficient output, unsatisfactory

Alabama State Form

Quantity of work (just a friendly suggestion)

( ) Exceptionally high output. Keep it up.
( ) Better than average. Good going.
( ) Meeting our requirements.
( ) You could do more. Try harder.
( ) You could do a lot more. Try much harder.

INTENTIONAL MISREPRESENTATION 

The only way to eliminate intentional misrepresentation in rating
either oneself or others is to construct the rating situation so that the
rater will not know how his judgments are to be scored. The best
single approach is probably the forced-choice method discussed
earlier, when one of two attributes, which are equally acceptable
socially, must be chosen. One attribute had been found to distinguish
between the highest and the lowest third of officers, while the other
did not. This technique should have wide application in both school
and industry.
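
The logic of forced-choice scoring can be sketched as follows. The
rater sees only pairs of equally acceptable phrases; the key, which he
never sees, records the phrase in each pair that was found to
distinguish high from low officers. The phrases, the key, and the
function name below are invented for illustration and are not items
from the Army form.

```python
# Each pair holds two phrases judged equally acceptable socially.
PAIRS = [
    ("plans work ahead", "is well liked"),
    ("speaks clearly", "accepts responsibility"),
    ("keeps calm", "explains orders fully"),
]

# Hypothetical key: index (0 or 1) of the discriminating phrase in each pair.
KEY = [0, 1, 1]

def forced_choice_score(choices, key=KEY):
    """Count the pairs in which the rater chose the discriminating phrase."""
    if len(choices) != len(key):
        raise ValueError("one choice is required for every pair")
    return sum(1 for chosen, valid in zip(choices, key) if chosen == valid)
```

Because both phrases in a pair look equally favorable, a rater who
wishes to flatter the person rated has no means of knowing which choice
raises the score.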

Several inventories now include scores which show any tendency
to lie or exaggerate or to give bizarre answers (Chapter XXII).

Intentional misrepresentation can be reduced to a marked degree
by eliminating fear of reprisals and by convincing the rater that his
honest appraisal will be beneficial to him. In many situations this is
not easy to do. For instance, when service or efficiency ratings are
used by government agencies there is a marked tendency to rate
nearly everyone considerably above average. While this practice
avoids the unpleasant feelings and the appeals which realistic ratings
might evoke, by not pointing out needs for improvement, one main
purpose of the rating is defeated.

RULES FOR RATERS 

The considerations just discussed suggest the following rules which 
should prove helpful in avoiding errors in using rating procedures:

1. Each trait should be defined as clearly as possible; vague terms
should be avoided. 

2. Each trait should refer to only one relatively independent pat- 
tern of behavior. 

3. Raters should judge on the basis of their actual experiences, 
avoiding prearranged schemes which may not apply to a particular 
situation. 

4. Complete integrity should be secured in rating either self or 
others. It may be an advantage to secure ratings from judges who 
are ignorant of the use that is to be made of them, and who do not 
know how the form will be scored.

5. Raters should frequently check their definitions and scales of 
value with the accepted criteria. 

6. The combined ratings of several equally competent judges 
should be used rather than the ratings of one judge. 

Good evaluations depend upon four complex conditions:
opportunity to observe, competence, willingness to report a fair rating, and
the availability of an accurate evaluation procedure. 

COMBINATIONS OF RATINGS 

It has been shown that the most self-consistent ratings often come 
from combining the work of several raters. Raters often seem to
cancel out one another's idiosyncrasies. The process of combining
ratings is, however, frequently complicated by the fact that different
raters vary in their leniency and in the scope of their judgments. This
is well demonstrated by Illus. 163. The median ranks and dispersions
are different for each rater. The first nurse ranked the group nearly 
a whole step lower than did the second nurse. This discrepancy means 
that the ranks assigned by one are not comparable to those assigned 
by the other. Such differences as these are frequently found among 
raters. Training tends to reduce the differences, particularly if judges 
are told to use all the classes and to make their results fall in a 
normal curve. There are, however, some objections to this procedure.
It may lead to artificiality, and the probability is that in many small
samples the curve of distribution is not a normal curve.

In situations where it is desirable to have the results of all raters
comparable it is possible to change all into standard scores. Symonds
(1931, p. 81) has facilitated such changes by giving a table of the per
cents which may be expected from 3- to 7-point scales, applied to
two groups, one of 40 and one of 185 persons. However, Conrad (1932,
1932A) found that adjustments to a common distribution did not
significantly increase the validity of combined ratings in the United
States Army Man-to-Man scale (Illus. 161) nor in the rating of
intelligence of nursery school children. He concluded that in these
situations such time-consuming adjustments were not worth the trouble
required. When large differences occur between two raters, however,
an adjustment that makes their ratings comparable may be advisable.
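
A conversion to standard scores of the kind mentioned above can be
sketched in Python. The sketch assumes that each rater's ratings are
expressed as deviations from his own mean in units of his own standard
deviation, and that the standard scores are then averaged person by
person; the function names and the sample ratings are hypothetical.

```python
from statistics import mean, pstdev

def standard_scores(ratings):
    """Express one rater's ratings in units of his own standard deviation."""
    m, s = mean(ratings), pstdev(ratings)
    if s == 0:
        raise ValueError("a rater who gives everyone the same rating cannot be rescaled")
    return [(x - m) / s for x in ratings]

def combine(raters):
    """Average the standard scores given to each person by several raters."""
    z = [standard_scores(r) for r in raters]
    return [mean(person) for person in zip(*z)]

lenient = [5, 6, 7, 6, 5]    # rates everyone high, with a narrow spread
severe = [1, 3, 5, 3, 1]     # rates everyone lower, with a wider spread
combined = combine([lenient, severe])
```

The two raters in this example differ in leniency and in spread but
agree exactly on the relative standing of the five persons, so their
standard scores coincide.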

Another and more difficult problem in combining ratings is that
of securing equally good raters. In many industrial situations it is
impossible to find more than one rater who has had adequate
opportunity to observe the persons to be rated, and even when more
than one judge is available, it is probable that they vary a great deal
in their competency as raters. Often a good rating may be
contaminated by several poor ratings, because the validity of the
separate raters could not be determined and no criterion was available
to which the good rater could appeal.

VALIDITY OF RATINGS 

The desire to have definite information concerning the degree to
which ratings reveal the true situation has led to considerable
research, chiefly along two lines. One of these correlates actual
measures with ratings, and is therefore limited to situations where actual
measures are available. The other procedure sets up some
combination of judgments as the nearest available approximation of the
truth and compares other judgments with this.

The first procedure is used in the work of Marsh and Perrin
(1925), who correlated ratings with various direct measures. The
judges observed students performing certain tests, and then judged
the performance. The correlation between estimated and measured
aiming was only .36; between estimated and measured intelligence,
.78; and for appraisals of card sorting, .68. Ratings of head size
correlated .76 with measures of head circumference. These facts
indicate considerable discrepancy and lead one to suspect that ratings
of more intangible traits may be even less in keeping with the facts.

The second procedure is well illustrated by the work of Adams
(1936), who also furnished formulas for showing certain qualities of 
a scale. In one of his experiments, on two different occasions a num- 
ber of persons ranked ten printed circles in order of size. The two 
rankings of each person were correlated to give a personal consistency. 
These correlations were averaged to give self-consistency for the 
whole group. Another figure called group-consistency was secured by 
correlating the rank order assigned by each person with the rank 
orders of all other persons. Since in practice this procedure involved 
too many calculations, a sample thought to be representative of all 
person-to-person correlations was taken. In this experiment and also 
in a large number of other similar experiments with objects, the self- 
consistency was found to be the same as the group-consistency. Both 
group-consistency and self-consistency were high or low depending 
upon the difficulty of the discrimination. 

In another experiment the same persons ranked ten students on
personality traits, for example, courage. The results showed that self- 
consistency was considerably higher than group-consistency. This 
finding indicated that persons were more consistent with their own 
biases than with those of others. The group-consistency indicated 
the amount of random error, whereas the self-consistency was raised 
by systematic errors of personal sorts. Adams, therefore, proposed an 
index of objectivity found by dividing group consistency (GC) by 
self-consistency (SC). If GC equaled .81 and SC equaled .90, then the
equation from which the index is derived will be

$$\frac{GC}{SC} = \frac{.81}{.90} = .90 = \text{objectivity index}.$$

When this index is 1.00, group- and self-consistency are identical, 
as in judging the size of circles. The judgments are considered to be 
completely objective even though they may not be accurate.
Objectivity is therefore defined as the lack of systematic errors in
judging. An objectivity index of less than 1.00 indicates systematic errors
on the part of the judge. The lower the index, the lower is the ob- 
jectivity. This index is a useful one which should be applied to both 
rating and measuring techniques. 
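
The two consistencies and the index can be sketched in Python. The
sketch assumes that rankings without ties are compared by the
rank-difference correlation, since the text does not say which
coefficient Adams used, and it computes group-consistency from all
pairs of judges on one occasion rather than from a sample of pairs. The
function names are illustrative.

```python
from itertools import combinations
from statistics import mean

def rank_correlation(a, b):
    """Rank-difference (Spearman) correlation for two rankings without ties."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d2 / (n * (n * n - 1))

def self_consistency(first, second):
    """Mean correlation between each judge's first and second ranking."""
    return mean(rank_correlation(x, y) for x, y in zip(first, second))

def group_consistency(rankings):
    """Mean correlation between the rankings of every pair of judges."""
    return mean(rank_correlation(x, y) for x, y in combinations(rankings, 2))

def objectivity_index(gc, sc):
    """Group-consistency divided by self-consistency."""
    return gc / sc
```

With the figures of the example in the text, objectivity_index(.81, .90)
gives .90.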

In the experiment on judging sizes of circles there was an accepta- 
ble criterion for size, namely, the areas calculated from actual meas- 
ures. When there are no widely acceptable criteria available, as in 
the judgment of courage, tact, or artistic ability, then there is no 
possibility of a conclusive check on the degree to which a rating ap- 
proximates the truth. There are two procedures, however, which are 
designed to furnish criteria of approximate truth. In one the central 
tendency of a group of judges is taken as the best approximation. In
the other the agreement of several similar procedures is taken as
evidence that all are yielding true appraisals.

The use of a central tendency as the best approximation of a fact
is, of course, open to question. If a number of judges make the same
error, the opinion of a single expert will be closer to the truth. If,
however, all judges have nearly equal ability, then a central tendency
has been shown in many experiments to be nearer the fact than the
judgment of any one of the judges. This discussion raises the very
interesting question, who is the best judge? or, in the case of
instruments or tests, which is the least subject to errors? In some instances
the qualifications of the judges or of the persons who constructed the
tests are considered. Thus an expert in chemistry would be
considered a more competent judge of chemical analysis than an
untrained person. In many other cases, however, no accepted
qualifications are available. Although some work has been done, further
intensive research is needed to show the extent to which accuracy of
a particular judgment is related to personal characteristics.
Hollingworth (1911), Hoffman (1923), Shen (1925), Adams (1927), Conrad
(1932), and others have reported that self-ratings are usually too high
on desirable traits and too low on undesirable traits, and that
superior individuals often underestimate themselves and the inferior
overrate themselves. Adams found that a good judge of self is likely to be
more intelligent, observing, sympathetic, generous, courageous, and
happier than a good judge of others. Hollingworth believed that there
was a positive correlation between possession of a desirable trait and
ability to estimate it in others.

The other approach to the criterion for validity is made by using
four ratings all of which are designed to appraise the same quality.
First, one must show that the appraisals are free from systematic
error, by methods described for securing objectivity. Next, as Adams
(1936) has shown by Spearman's logic, the four appraisals may all be
considered to measure the same function if together they satisfy the
formula

$$\frac{r_{ab}}{\sqrt{r_{aa}\,r_{bb}}} = 1.00.$$

Here if a correlation between raters a and b is equal to the square
root of the product of their reliability coefficients, the validity
index is 1.00. For example, if on two occasions persons a and b ranked
ten photographs for their advertising appeal, and if the following
correlations were found: $r_{aa}$, .60; $r_{bb}$, .30; and $r_{ab}$,
.42, then, by substitution of these values in the formula, we have

$$\frac{.42}{\sqrt{.60 \times .30}} = .99.$$

In this case Adams would conclude that
both persons were using the same set of standards in rating the
photographs, even though one rater was much more consistent than the
other. If the index is much less than 1.00, the raters are not using
the same standards. 
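
The arithmetic of the example can be checked with a few lines of
Python. The function name is illustrative; the formula is the one
given above, which has the same form as Spearman's correction for
attenuation.

```python
from math import sqrt

def validity_index(r_ab, r_aa, r_bb):
    """Correlation between two raters divided by the square root of the
    product of their self-correlations (reliability coefficients)."""
    return r_ab / sqrt(r_aa * r_bb)

index = validity_index(0.42, 0.60, 0.30)   # about .99, as in the example
```

The index stays near 1.00 here even though rater b agrees with himself
only .30, because the low correlation between the raters is fully
accounted for by their unreliability.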

Applications of these procedures for studying validity are noted 
in the subsequent chapters. 

STUDY GUIDE QUESTIONS 

1. What are the processes involved in making a numerical rating? How
do they differ from those involved in securing a test score?

2. How are inventories prepared to yield unequivocal scores?

3. What is the procedure in the paired-comparisons method? What
advantage has it?

4. What is the rank-order method? Does it depend upon a normal
distribution of persons in the group?

5. How can a scale of attitude toward such an institution as the church
be made to have equal steps, that is, equally often noticed differences
between the steps?

6. How are rating scales prepared?

7. How were the forced-choice items prepared for the United States
Army officer rating procedure?

8. How may the relative intangibility of various items be determined?

9. How may the optimum number of steps to use in a graphic scale be
determined?

10. What is the halo error in rating? How may it be avoided?

11. What can be done to reduce unwanted fear or emotional reactions to
ratings?

12. How can intentional misrepresentation be reduced?

13. What advantages and what difficulties are there in combining ratings
from various judges?

14. What are the principal methods of rating or estimating personality
traits? What aspect of comparison does each stress? What are the advantages
of each?

15. What are the advantages and disadvantages of the forced-choice
method?

16. What evidence is there of emotional reactions to specific items or to 
questionnaires as a whole? 



CHAPTER XVII 


DRAWING, PAINTING, 
AND HANDWRITING 


INTRODUCTION 

Broadly defined, visual-motor behavior consists of a large variety of
visual perceptions followed by expressive movements. All techniques
discussed in this chapter have in common the sensory-motor patterns
involved in making marks on paper. They differ from each other
with regard to restrictions of content, the use of media and tools,
and the administration and scoring procedures. Different degrees of
encouragement are employed and some procedures employ a
considerable amount of recording of verbal comments or stories.
Evaluations of intellectual growth and of artistic design in drawings are
discussed in Chapters VIII and X. Character or dynamic evaluations
are discussed in this chapter.


VISUAL-MOTOR GESTALT

Drawing or copying tests have often been used to appraise not
only the normal maturation of visual and motor functions, but also
variations associated with mental defect and personal adjustment or
integration. The work of Lauretta Bender (1938) is well known with
regard to the latter. Over a period of 20 years Miss Bender followed
and supplemented the work of Wertheimer (1923), Kohler (1929),
Schilder (1931), and Koffka (193?) who emphasized the dynamic
aspects of perceiving and understanding. The act of copying a
pattern is a complex one in which the comprehension of directions,
willingness to cooperate, sensory-motor mechanisms, and personal
motives and integration all have a part. Two opposing tendencies
are always thought to be present: an integrative one, which results
in good coordination and complete, accurate drawings, and a
disintegrative one, which results in patterns that are simplified, warped,
elaborated upon, made fragmentary, or destroyed. A drawing always
represents a momentary equilibrium between various forces, but it
also frequently indicates fairly stable modes of adjustment.

ILLUS. 166. BENDER GESTALT FIGURES

(Gestalt drawings after Wertheimer, 1925, furnished through the courtesy of
Dr. Max L. Hutt. In the test situation the figures are shown one after another
and without numbers.)

Bender used nine of the thirty drawings developed by Wertheimer
(Illus. 166) in her studies of normal and abnormal persons. The
first design (marked A) is usually perceived in horizontal sequence
and represents two contiguous figures, each of which is known as a
“good gestalt” or complete figure. The figures marked 1, 2, and 3
consist of dots or small loops without boundaries. The patterns are
determined by the shortest distances between parts. Figures 4, 5, and
6 present difficulties in organization by being partly open and
requiring careful perception and drawing of points of contact
between patterns which are not usually perceived as wholes. The last
two figures are combinations of two closed polygons which involve 
conflict and various perceptual relationships. 

Administration (Bender) 

The designs are printed on separate cards and presented in order 
with informal directions, such as, “I am going to show you some 
cards one at a time. Each has some simple figures on it. Copy the
figures as well as you can. This is not a test of artistic ability. If you
have any questions feel free to ask them.” Any questions concerning
time allowed, size of drawings, number of sheets to use, are answered
by saying, “That's up to you. There are no rules.” Blank sheets of
8½- by 11-inch paper are furnished, with a soft pencil and eraser.
Figures from all the cards may be copied on the same piece of paper
or additional sheets may be furnished if the subject wishes. There is
no time limit. Hutt (1949) asks the subject to elaborate the drawings
after he has completed all of them. He also includes an inquiry
period in which the subject is asked, “What could it be?” and a
testing-the-limits period when the perceptual and integrative
processes are investigated.

Interpretation 

Behavior during the course of the test is observed, and pertinent
oral responses are written down. No standard numerical scores are
available, but drawings are evaluated by signs or patterns which are
compared with those which have previously been made by known
groups of persons.

Hutt (1949) points out that in completing the Bender-Gestalt Test
or any drawing test there are usually four steps which often, though
not always, follow a definite temporal sequence:

1. Motivation. In this step the subject is told about what is
expected on the test. He agrees to cooperate, probably with some
reservations, and observes the first card.

2. Selective attention and perception. The subject then explores
the card visually and selects the aspects to be reproduced. The
selection is partly controlled by one's needs.

3. Motor response. In this step complicated visual-motor behavior
produces marks on paper, and sometimes speech. The movements
are usually independent of educational level, except for training in
drawing. Symbols are seldom inhibited unless the subject knows their
social significance.

4. Reaction to motor response. In this step there may be a
fantasy, or a period of autocriticism, when the subject compares his
drawing with the stimulus, or he may strive to complete what seems
to be incomplete. During this period unconscious needs often appear
because, in the drawing process, the ego is less likely to be
inhibited than during speech. However, the subject may be aware of
some aspects of his conflict and may show emotional behavior, such
as postural and vasomotor changes.

Hutt (1949) has listed four determinants which will be briefly
surveyed here to show their nature and complexity:

Organization 

a. Order and sequence of placing drawings on a page range from
very rigid through regular, irregular, to chaotic. Egocentric persons
have been found to start in the middle of a page and to use several
pages. Insecure persons often start in the upper left corner. Normal
folks usually begin near the middle or left of middle at the top.

b. Normal subjects usually use a page or page and a half. Paranoid
reactions to protect self from a hostile world often result in the use
of a small part of one page. Neurotic tendencies are often shown in
large amounts of empty space and in small figures. Compulsive trends
are related to the use of margins either as a guide or for support.
Compactness of figures and collisions between figures are related to
dependency and poor planning.

Size 

Size is related to anxiety. Marked deviations in size, especially
smaller figures than the originals, or less frequently larger, indicate
insecurity. Often the disturbing part of the figure is out of proportion
to the rest of it. A progressive increase in size is related to
release of tension or to the development of greater assertiveness during
the test, and a progressive decrease to feelings of failure and defense
needs. Fear of authority tends to reduce vertical dimensions and
increase relative horizontal dimensions.

Changes in Form 

In general poor closure and organization and specific symbols are
related to anxiety and dissociation. Difficulty with drawing curves
reveals several tendencies. Thus, poor social adaptability is often
related to greater angularity, and poor balance between impulses and
controls to great irregularity. Organic brain damage is related to
decreases in angularity and poor integration or simplification of
pattern in overlapping figures, particularly where angles appear.

Distortions of Gestalt 

The rotation or partial rotation of figures beyond approximately
10 degrees either way is found most frequently among schizophrenics.
The original gestalt of the drawing is frequently lost. Those with
organic lesions produce partial rotation of figures more often than
full rotation or reversal. Rotation usually indicates degrees of
disorientation.

Among cases of serious regression dots are converted into loops or
sometimes into vertical or nearly vertical scribbles. Rows of dots
become wavy lines, and perseverations from one figure to another
appear. These may be due either to inaccurate perception or to an
attempt to simplify the task. They may be due to damage, especially
in the parietal-temporal regions of the brain.

Doodling, elaborations, perseverations, and artistic sketching are
often related to various tensions and can be explained only by
associative techniques. Boundary lines and symmetry often relate to
arbitrary restrictions in life space.

In addition to these determinants Hutt also records type and
direction of movements and methods of work. If movements are related
either to great tension or to flaccidity, the reasons should be explored.
They are often related to ego strivings. Movements away from the
body are usually indicative of aggression or repulsion, while
movements toward the body indicate passive states. Movements in vertical
planes reveal something of one's reaction to authority, and in lateral
direction to contacts with peers or to interpersonal relations in
general. Diagonal trends on a page sometimes result from indecision or
poor solution of conflicts.

Methods of work vary from extremes of precise detailing of a
compulsive kind to impulsive motor responses which have little
connection with the figure. A normal person usually draws without
much speed, frequently stopping to look at the model in order to
correct his work or to anticipate what comes next. Setting up guide
lines, counting dots, and measuring distances are done in excess by
compulsive and rigidly controlled individuals.

This discussion of determinants does not include fantasy and
symbols which are often important in personality diagnosis. These
are introduced in Chapter XVIII. No significant body of normative
data is as yet available to indicate frequency of patterns and their
relationship in various syndromes. This extremely important field of
research is now being actively investigated.

Stages of Maturation 

Bender (1938) found fairly clear age norms from four to ten years, 
which she presented with charts together with percentages of chil- 
dren at each age who succeeded. Her findings are briefly summarized 
here. 

When children below the age of three are asked to copy drawings, 
they normally scribble with large arm movements which result in 
whirls or pendulum waves. Dots are made by heavy punching move- 
ments. The results are not meaningful pictures, but the products of 
motor expression. 

At four years some inhibition of movement occurs which often
results in smaller single or concentric loops or circles. Patterns are
made by combining these loops and making them wider or higher
to resemble the exposed pattern. Dotted forms are usually reproduced
in the form of curved figures, and there is much motor perseveration
both in number of strokes and in pattern. The first pattern
produced may be given for all the rest or influence the rest. Among
right-handed children there is a marked tendency to draw from left
to right, and the opposite is true of the left-handed. Contiguous or
overlapping objects are often seen and drawn separately.

Between four and seven years there is rapid improvement in form.
At five years short vertical and horizontal lines are managed, but
diagonal or slanting lines are difficult. Thus the square in Illus. 166
is drawn on its side. Dots are usually drawn as small loops.
Contiguous or overlapping forms, as shown in Items A, 4, 5, 6, and 7, are
drawn as separate objects. The figures are usually arranged on the
page to follow large concentric circles which correspond to arm
movements.

At the six-year level, a real diamond appears in the drawing (Illus.
166 A) but with irregular sides, usually a little curved. In Item 1
dots are made as dots, or as very small circles. There is still a tendency 
to separate contiguous or overlapping figures, but most of these pat- 
terns show correct contacts. 

At seven years the child still has difficulty in producing slanting 
angles as is shown in 2 and 3, and producing lines, as shown in 7. 

The diagonal slants are all well handled by 60 per cent of the
ten-year-old group, the detail forms are correct, and there is a marked
tendency to count the dots. Adults make only slightly more precise
drawings.

Typical Deviates

Bender points out that clinical classifications hardly ever represent
pure types, but that wide variations in personality structure occur in
every group. In using the Bender-Gestalt Tests she finds it is desirable
to determine an age level from the gestalt drawings and also from a
Binet-type test. When the level of mental development is known,
then the effects of conflict or injury can be seen. Bender's (1938)
findings with deviate groups, which are typical of the reports of
others, are summarized here.

Mental defect. Among those with mental defects, drawings are
fairly typical of their mental-age level. However, motor maturity is 
greater than for the normal of the same mental age, hence the draw- 
ing may be more definitely controlled. These mentally deficient also 
often show patterns which are typical of aphasic, schizoid, or con- 
fused states, indicating that mental retardation is often complicated 
with emotional disturbances. 

Brain injuries. In the case of violent brain injuries or internal
bleeding, as in arteriosclerosis, there is first a confused state in which
perception and drawing responses are very difficult. Later, as the
patient improves, the drawings may indicate which parts of the
brain have suffered the most injury. Bender reported eight cases
where sensory aphasia was a prominent symptom and others where
various injuries had been caused by syphilitic infection, alcohol,
and trauma. It is possible that lesions of basal ganglia cause
reductions in the number of elements of drawings and also fragmentation
at points not related to the maturation principles of gestalt theory.
Injuries of the cortex are often related to the difficulties in
integration of patterns. Korsakow psychoses with few organic features show
many confabulations with past memories, as well as reversion to
primitive responses, and disorientation, but the essential designs are
maintained.

Functional difficulties. Bender found catatonic schizophrenic
patients showed marked tendencies to revert toward more primitive
or elementary types, but in doing so to express change in rate of
movement or direction in parts of the patterns. This often causes
extreme exaggeration or disregard of the inherent gestalt. In
addition, there is often much perseveration, and the original stimulus
may be lost in elaborations of inner impulses to establish one's own
identity.

In mild manic states there are rapid attempts at careful
reproduction, often with erasures and with expressed feelings of satisfaction.
Often embellishments are rapidly added, which do not destroy the
original figure, but represent a flight of ideas. These frequently
incorporate local present details. Quick verbal associations usually run
ahead of the drawing.

In mild depressed states the same type of response is given but with 
less satisfaction and more slowly. Inhibitions may reduce the de- 
tails of small dots and circles. Sometimes negative or shadow images 
are used. 

Lying, malingering, and the so-called Ganser syndrome, where the
client answers simple questions with understanding but foolishly, 
are usually reflected in drawings which in some systematic way alter 
the patterns. Such alteration could be made only with perception of 
the true pattern. Usually the true mental level is indicated and the 
elaborations are trivial. 

Motor Gestalt (Mira) 

A pencil-drawing test which nearly eliminates the visual control 
by using a blindfold or a screen, has been described by Emilio Mira 
(1940), who wished to measure the degree to which movements were 
varied by imagery and posture. The subject is seated comfortably so 
that his body and face point directly at an 8½- by 11-inch sheet,
fastened to a drawing board. In the first part of the test the examiner 
first demonstrates by drawing a horizontal 5-cm line from left to right 
with the right hand, and two similar lines just beneath it. (The wrist 
is not allowed to touch the table at any time during the test.) The
subject is then blindfolded and the examiner guides his hand to the 
starting point on the paper. After drawing ten lines from left to 
right, the subject is given a fresh space on the paper and asked to 
draw ten more similar lines from right to left with the right hand. 
Then the left hand is used to make two similar sets of lines. When 
this is finished the subject first makes ten lines perpendicular to the 
bottom of the sheet moving away from the body and ten moving to- 
ward the body with the right hand, and then similar sets with the left 
hand. After a short rest the experimenter demonstrates and the sub- 
ject begins the second part of the test. 

In Part II the subject is not blindfolded, but a screen is used to 
prevent visual control. He draws the following: 

a. Zig-zag lines an inch in length with angles of about 10 degrees, 
with both hands simultaneously, beginning at the top of the page and 
working down to the middle, then reversing direction and working 
upward from the middle. 

b. Chains of separate circles from right to left, and in
reverse direction with each hand separately.

c. A staircase with steps V^o of an inch in height, moving diagonally
up and then down with each hand separately.

d. A pattern similar to the turret of a castle, drawing horizontally
in each direction with each hand separately.

The scoring of the results in Part I is objective, for Mira measures
the absolute and relative lengths of lines for each hand, computes
averages, variability, trends toward longer or shorter lines than those
in the model, shifts in direction, and a coefficient of coherence, which
is the average relative shifting divided by the average absolute
shifting. Shifts in direction are found by measuring the distance of the
midpoint of each line to a perpendicular from the midpoint of the
first line. Positive values are assigned to shifts in the direction that
the lines are drawn. Thus if nine lines had as shift distances in
millimeters 0, -1, -2, -2, +1, +3, +3, +4, +5, they would have an
absolute shift of 21, a relative shift of +11, and a coefficient of
coherence of .523. The score of Part II is more subjective, for the
straightness of lines, their pressure or load, and their orientation are
considered. Meticulousness, impetuousness, and failures or emotional
blockings, and other characteristics of the method of drawing are
observed.
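
The arithmetic behind the coefficient of coherence can be checked with a short sketch. It is only an illustration, not Mira's own scoring procedure: the function name is hypothetical, and the nine shift values are read as 0, -1, -2, -2, +1, +3, +3, +4, +5, the reading that agrees with the stated totals of 21 and +11. Because both averages are taken over the same number of lines, the ratio of the averages equals the ratio of the sums.

```python
def coherence(shifts_mm):
    """Return (absolute shift, relative shift, coefficient of coherence).

    shifts_mm: signed shift of each line's midpoint, in millimeters,
    positive in the direction in which the lines are drawn.
    The function name and return layout are illustrative assumptions.
    """
    absolute = sum(abs(s) for s in shifts_mm)  # shifts summed without sign
    relative = sum(shifts_mm)                  # shifts summed with sign
    if absolute == 0:
        # No shifting at all: the ratio is undefined.
        return absolute, relative, None
    return absolute, relative, relative / absolute


# Assumed reading of the nine shift distances in the worked example.
shifts = [0, -1, -2, -2, 1, 3, 3, 4, 5]
absolute, relative, coefficient = coherence(shifts)
print(absolute, relative, coefficient)
```

With these values the absolute shift is 21 and the relative shift is +11, so the coefficient is 11/21, or a little over .52. A coefficient near +1 or -1 would mean that nearly all the shifting ran in one direction, and a coefficient near zero that the shifts largely cancelled.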

Mira’s results, as yet tentative, show retest reliability on scores in
Part I in the neighborhood of .80 among thirty-five normal adults.
Among right-handed persons the right-hand scores are slightly less
reliable than the left-hand scores. There is evidence that the right-hand
behavior is more influenced by intellectual activity and the left more
by muscular and emotional constitution. Depressed persons show
more downward tendencies with the left hand than with the right.
In general, vertical shiftings are related to ascendant or withdrawing
tendencies, and vertical length to amount or strength of activity.
Mira offers no simple theory related to horizontal shiftings.

Among the clinical groups the schizophrenics showed unusual
tendencies to lose original direction, to reverse direction, and to fail
in drawing patterns in Part II. As was expected, elation and depression
were related to both length and shift. Data from other groups
are being collected. The Mira test has interesting possibilities
because it stresses basic postural movements and yields some results
which can be scored objectively. It is simple to administer and score.
It should give interesting results with children and with other clinical
groups besides the schizophrenics, particularly those with organic
brain damage.

Mosaic Tests 

Among the tests designed to reveal methods of organization of 
materials, the Lowenfeld Mosaic Test is a good example. The Mosaic 
Test as used in this country is described by Dimond and Schmale 
(1944). The materials used are altogether 130 pieces: squares,
diamonds, right-triangles, rectangles, and isosceles triangles. The colors
black, white, red, blue, green, and yellow are so used that there are
not more than ten pieces of any one color and shape. The squares
are 1 inch square, and the other shapes nearly the same size, so that
they can be made to fit together nicely. All blocks are % ^ of an inch
thick. A wooden tray, 18 by 26 inches, is also used for the test.

The instructions state, “Make anything you like out of the pieces,” 
but in practice it was found that nearly one third of the subjects 
mildly or strongly disliked the patterns they produced. Detailed notes 
of the subject’s attitude, manner of selecting and placing the pieces, 
and verbalizations were used in rating the following nine items: (1)
ideation or ability to think up patterns, (2) cooperation, (3) attention, 
(4) anxiety (specific or not to the test situation), (5) carefulness in 
selection, (6) carefulness in placing, (7) persistence, (8) manner of 
completion, and (9) approval of results. Dimond and Schmale finally
distinguished five patterns of behavior: normal, mildly defective,
moderately defective, severely defective, and unclassified. They found
certain results to be typical of mental disease. Thus, the psychotics
were usually cooperative while the psychoneurotics were often
uninterested, evasive, and resistant. The schizophrenics showed disregard
of color or active color rejection by using only black and white. They
made literal or bizarre configurations with precise symmetry, but 
often showed blocking and incompletions, and severely defective 
gestalt. The psychoneurotics had many variations similar to normal 
and usually made normal or mildly defective gestalt. 

This type of test covers somewhat the same areas of perception and 
color reaction as the Rorschach, but requires in addition constructive 
planning and movements toward completing the plan. A good deal 
of research is now going on which will allow much more important 
interpretations to be made with more confidence. 

DRAWING OF OBJECTS 

The use of freehand drawings as indications of personal dynamics 
is old in practice, since artists and art critics have always emphasized 
that art can be and usually is an expression of one’s interests,
satisfactions or fears, conscious or unconscious. Technical articles on
children’s drawings by Barnes (1893), Burk (1902), and a number of
others have described in detail nearly all of the categories used at
present.

Freehand drawings or paintings yield extremely complicated
results and hence have, to date, defied the comprehensive analysis of
factors and synthesis of findings which are typical of standard tests
of number ability. Nevertheless as yet relatively little research has
been done with regard to drawing when compared with number
skills, and it is highly probable that many fairly definite dynamic
patterns in drawing will be found. All of the analyses used in the
Bender-Gestalt Test seem applicable to pencil drawings and most of
them to brush and finger painting. In addition drawings call for
original compositions, so that the subject’s mental content is given
opportunity for greater expression. Drawings are thought by many
to be more directly expressive of deep or unconscious desires than
written or spoken language, because drawings are not a common
method of communication, and have more symbolic content which
is not recognized by the subject and hence not censored. Space is
allowed here for only two samples of analytical procedures and
interpretation.

Karen Machover (1949) described a technique for securing and
interpreting drawings of persons. She simply presents the subject with
a white 8½- by 11-inch sheet and a medium-soft lead pencil and
requests him to “Draw a person,” or, in the case of young children,
“Draw somebody.” During the drawing careful observations are
made and recorded of the subject’s questions and comments, the
time used, and the sequence of the parts drawn. When one drawing
is complete, the subject is given another sheet and asked to draw a
picture of a person of the sex not represented in the first. If there is
time for only one drawing, it is preferable to have the subject draw
a figure of the same sex as himself. Resistance may need to be
overcome by stating, “This has nothing to do with your ability to draw.
I’m interested in how you try to draw a person.” If an important part
is omitted, the subject may be urged to draw it. The two drawings
usually require less than 20 minutes.

In order to gain insight into structural weaknesses and conflict an
inquiry period of from 10 to 20 minutes is used. The subject is told,
in language appropriate to his age, “Let’s make up a story about the
person as if he were a character in a novel or a play.” Resistance may
be overcome in various ways, such as by asking, “How old is the
person? Is he married? What gets him angry? What is the best part
of the body? The worst?” Twenty-four questions of this sort have
been listed on a standard record sheet. The subject is further asked
if the figure reminds him of anyone in particular, and which of his
statements refers to himself as well as to the picture, and to explain
unusual details in the picture.

In interpreting the results, Machover finds much evidence that the 
subject projects his own characteristics and some of his conflicts into 
the drawing. For instance (page 31), she finds: 

The size of the figure, where it is placed on the sheet, the rapidity of 
graphic movement, the pressure, the solidarity and variability of the line 
used, the succession of parts drawn, the stance, the use of background and 
grounding effects, the extension of arms toward the body or away from it,
the spontaneity or rigidity, whether the figure is drawn profile or front 
view are all pertinent aspects of the subject’s self-presentation. 

In interpretation the proportions of the body, shading, detailing, 
incompletions, erasures, line changes, symmetry, and mood expressed 
in the face or in the postural tone of the figure are all given considera- 
tion. The significance of variations in space, line, and proportions 
is thought to be similar to that reported for the Bender-Gestalt Test. 
Machover also gives 70 pages of principles of interpretation covering 
in detail the head, hair, features, neck, extremities, trunk, breast, 
shoulders, hips, clothing, movement, conflict indicators, and devel- 
opmental considerations. Part of the interpretation of the head (page 
36) is: 

The head is the important center for the location of “self.” Heads gen- 
erally receive emphasis, except in drawings of neurotic, depressed, or so- 
cially withdrawn individuals. The head is essentially the center for intel- 
lectual power, social dominance, and control of body impulses. It is the 
only part of the body which is consistently exposed to view, thus being
involved in the functions of social relationships. . . . The obsessive-
compulsive will frequently give an almost ape-like presentation of physical power
in the figure he draws, while underplaying the head. In this instance the 
head is definitely considered to be the organ responsible for his conflict con- 
cerning free expression of his impulses. 

Disproportionately large heads will often be given by people suffering 
from organic brain disease, those who have been subjected to brain surgery, 
and those who have been preoccupied with headaches or other special head
sensitivity. . . .

. . . The youngster whose emotional and social adjustments have been
dislocated because of a severe reading or other subject disability will
frequently draw a large head in his figure. . . . The mental defective will for
similar reasons often give a large head. The paranoid, narcissistic,
intellectually righteous, and vain individual may draw a large head as an expression
of his inflated ego. . . . The sex given the proportionately larger head is
the sex that is accorded more intellectual and social authority.

It is not altogether clear why a young child of three or four will often
draw a large head, perhaps with appendages issuing from it, as a
completed representation of a person. It may be speculated that, since
locomotion and manual exploration of the environment (and of the child's own
body) are important features of a child’s early development, the appearance
of legs or arms before the body is functionally comprehensible. It is
with the head that surrounding adults smile, approve, frown or scold. The
head of the adult is the most important organ relating to emotional maturity
of the child. Perhaps the large head that dependent male adults give to the
female figure in their drawings represents an emotional fixation on a
supporting mother-image similar to that experienced by the child as a normal
phase in its development.

Girls are said to draw larger heads, shorter arms, smaller hands,
shorter legs and smaller feet than boys do. While girls have only to be
pretty and decorative to command social attention, boys are expected to
make rapid strides in the development of physical and sexual power, in
proficiency in athletics, to reach out into the environment more vigorously,
and to show more tangible accomplishment. . . .

Machover gives eight illustrative case studies with clinical histories
and interpretations of drawings. The interpretations show the
relations between the drawing, the associations given by the subject, and
the clinical history. No attempt has been made as yet to develop a
carefully defined set of diagnostic signs such as those used in the
Rorschach Test, but the material and the method lend themselves
to similar systematic analysis and recording. Machover considers her
monograph only the beginning of a larger and more complex
project.

Paula Elkisch (1945) issued a monograph in which the scoring of
the Draw-A-Man Test is divided into four sections. These are
summarized here.

1. Rhythm and its opposite, rule or rigidity, are indicated by
variations in the flexible quality of the stroke, elasticity, and
spontaneity. In rigidity there are tight spasmodic movements which seem
automatic or mechanical.

2. Complexity and simplexity are expressed through tendencies
toward complete representation of individual differences. In
simplexity there is a lack of differentiation indicating to Elkisch a lack of
ability to detach oneself.

3. Expansion and its opposite, compression. Expansion is seen
through the widening of the space used, in creations of spacious
backgrounds, and by a well-formed presentation which uses all the space
available. Compression is revealed by the meticulous and frugal use
of the space at the drawer's disposal. Expansion stands for potential
ability to make contacts.

4. Integration and its opposite, disintegration, are indicated by the 
degree to which the whole drawing shows relationship between its 
parts. Here there must be an essential center theme with other themes 
supplanting or contributing to it. The lack of integration is seen
when objects are piecemeal, broken, or contaminated by two or more 
things overlapping or crowding each other, and when there is a lack 
of center or any central theme. 

In addition to these four types of scores Elkisch evaluates the symbols 
from drawings according to psychoanalytical theory. 

PAINTING 

Painting is similar to pencil drawings in many respects, but it adds 
two complexities to an already complex dynamic pattern. One of
these is the use of color and the other the use of a variety of tools for
applying the color. Since the special techniques of palette knife and
brush used in oil painting are rarely used in clinics and ordinary 
schools, nearly all the reports dealing with diagnostic or remedial 
painting describe the use of inexpensive colors that are carried well 
in water. Two different procedures are widely followed. One uses 
glossy paper and the fingers or hand to spread the paint. The other 
uses a rough paper and medium-sized brushes. 

Finger painting is doubtless one of the oldest forms of art, but its 
modern form owes much to Ruth F. Shaw (1934). It has been used a 
great deal for therapy, since it releases tensions, elicits spontaneous 
fantasy material, and yields a permanent record of growth in
adjustment. A good illustration of its use is that of Napoli (1946). In the
procedure he uses, the examiner first demonstrates the preparation
of the paper, the selection of colors, getting seated comfortably, and
painting, and engages in a patter of comments on what he is doing while
portraying a story. The story is very important in securing rapport,
and later in making interpretations. Then putting the picture to dry,
putting the paints away, and washing one's hands in a bucket of
water are demonstrated. These activities are all carefully observed, and
what is said is recorded by notes or mechanically. No time or motion
patterns have been presented statistically or quantitatively, but 
Napoli notes carefully the posture, types of movement, position of 
first daub of paint, the use of space, order of procedure, the parts 
of the hand used, the colors and amounts of colors taken, the devel- 
opment of a plan, span of interest, and satisfaction with the end 
product. Napoli has laid a good analytical foundation for the
establishment of norms and the synthesis of complex patterns. For
instance, the parts of the hand used are described (p. 161) roughly as
follows:

1. Whole hand flat and relaxed

2. Flat palm with fingers raised

3. Lateral aspect of hand with fingers extended

4. Clenched fist with thumbs up

5. Outer side of thumb with fingers raised

6. Base of thumb with fingers raised

7. Base of palm with rest of hand raised

8. Knuckles in not too comfortable position

9. Flat part of finger or fingers relaxed

10. Fingertips

11. Fingernails

12. Whole arm including wrist relaxed

13. Fleshy part of arm with wrist raised

The frequency with which each mode is used by normal and by
clinical groups, and the reasons for the use will be an extremely
interesting study. Napoli has recorded some observations already. Thus
picking or “teasing” the paint was related to oral erotic or
masturbatory tensions. Pressure with the palm of the hand and fingers up often
indicated impulsive urges. Exclusively using tips of the fingers often
went with unusual fear of being soiled, and the lateral side of the
hand, with feelings of inferiority.

Napoli reported typical characteristics for schizophrenic, paranoid,
and unstable patients, which indicate dynamic patterns. Thus the
schizophrenics invariably showed two or more strata, illogically
related or disoriented, usually symbolizing aspects of conflicting areas
in the person. The paranoid had a central figure with well-integrated
objects on all sides for the purpose of protecting the central figure.
Some form of violent attack was usually found either in the drawing
or in the concurrent verbalization.

Rose Alshuler and LaBerta Weiss Hattwick (1947) published a report
of drawings, daily observations, and case studies of one hundred
and fifty children, ages from two to five years, covering an entire
school year. They reported that in general these children expressed
the same patterns in overt social behavior that they showed in
creative media, but some children showed their feelings much more
clearly in paintings than in overt behavior. They compared the
children's preferences for brush, finger painting, crayons, clay,
blocks, and dramatic play, and found that in the nursery schools
studied the brush or easel painting was used more frequently as a
means of self-expression than any other medium and that it provided
a medium for observing diverse patterns and subtle variations.
Children's choices when they were alone often differed from those
made when they were in a group, but the group situations were
principally those reported. The authors found fairly strong evidence
that:

a. Children who predominantly sought easel painting were more 
concerned with self and internal problems than the rest. They were
among the least mature in the group; they came from homes which
exerted too much control; and they were preoccupied with emotional
conflicts.

b. Children who preferred crayons tended as a group to be more 
concerned with expressing ideas than with finding emotional out- 
lets. They showed more awareness of environment and a drive to 
control it. All of them came from homes where they were exposed to
high adult standards and lacked opportunities to function on their
own levels of readiness. They were more tense and unhappy, and the
crayon activity did not seem to provide a release. Often children who
sought crayons when they were new to the nursery school situation 
turned to easel painting as they became freer in behavior. 

c. Children who preferred blocks stood out in the group for their 
spontaneous, outgoing, adaptive behavior. Blocks have little or no 
color, but require much aligning, fitting, and definite structuring. 
For some the manipulation of blocks provided a transition from
impulsive action to discovering and interrelating the facts in the
world about them. When they did paint they produced highly
structured patterns, angular strokes, and enclosures.

d. Children who preferred clay, or any children working with clay, 
tended to talk about and to represent emotional problems related to 
excrements and sex rather freely. The children who worked with clay 
usually were grouped around a small table, with three or four other 
children near by. 

e. The children who preferred dramatic play showed less de- 
pendence on materials and more reliance on interaction with other 
children, or on monologues and pantomimes with imaginative
content. They all were highly developed in their social orientation and
were affectionate and cooperative.

Alshuler and Hattwick developed a sheet, having twelve main
divisions, upon which the following twelve characteristics of a 
drawing were checked: general characteristics, mass, line, form, di- 
rection, spacing, size, color, techniques or manner of working, organ- 
ization, general effect, and content. 

Each of the divisions contains items to be rated or checked for 
intensity or frequency. The second through the ninth divisions refer 
to fairly objective judgments; the other divisions refer to more
subjective, but nonetheless important ratings. Thus the tenth division,
organization (p. 251), includes (1) unrelated lines, forms, (2)
organized lines, forms, (3) focused on one object, (4) variety
unrelated to object, (5) partial synthesis, (6) pure picture, (7)
experiment with themes, and (8) successive pictures to develop theme.

The organizational activities noted here should be compared with
those recorded for Rorschach and Thematic Apperception Test
interpretations. The development of age, sex, and other norms is
still a pressing need in this field.

HANDWRITING 

The interpretation of personality traits from handwriting, called
graphology, has a long history, but it has had the misfortune of being
abused and exploited commercially, and up to now little careful
research has been reported. There are two extreme methods of
approach: one, called a global approach, attempts to build a picture of
the whole personality by interrelating many complex estimates of
aspects of handwriting without measuring them carefully; the other,
called an atomistic approach, attempts to establish a relationship
between a single sign, such as length of stroke above the middle zone,
and a particular personality trait. Of course, most workers have used
some combination of these approaches.

Typical of the global approach is a report by Cantril and Rand
(1934), who located six individuals who showed high scores in one
and only one of five parts of the Allport-Vernon Tests of Values:
aesthetic, economic, theoretical, political and religious (Chapter
XXI). Each of these persons then copied the same letter on uniform
paper and signed a fictitious name. When the six letters were
photostated and submitted to twenty-four graphologists, seventeen of
them indicated correctly the main interests of four or more of the
writers, an occurrence that would theoretically happen by pure
chance about once in a million times. When the six letters were
submitted to twenty-six educated adults who had no knowledge of
graphology, no adult succeeded in naming the main interests of as
many as four students correctly.

Typical of a more atomistic approach is the work of Klages (1919),
who is sometimes referred to as the “father of modern graphology.”
He described sixteen aspects of handwriting which have been used
and elaborated upon by many other workers (Illus. 167).

Thea S. Lewinson and J. Zubin (1942) and Gerald R. Pascal (1944)
have well elaborated Klages' variables by using precise sampling,
and measuring written words with a millimeter rule. They used a
magnifying glass to discover differences of as little as .2 millimeters,
and made several hundred measurements of each sample.

Lewinson and Zubin (1942) define four main components (Illus.
167). Each of these writing elements is analyzed and rated on a
7-point scale, ranging from +3, extreme contraction, to -3, extreme
release. For instance, element g, height of middle zone, ranges from
% mm. for extreme contraction to 6 mm. for extreme release, and
element a, ornamentation, ranges from contraction into queer and
distorted forms, through conventional, neglected, equivocal, and
symbolic to decadent forms.

Lewinson and Zubin postulate that if a person's handwriting is
well balanced between contraction and release his personality is also
likely to be well balanced. If one exhibits much contraction in
writing, he is likely to be too hemmed in by compulsions and
rational controls. Conversely, great release in writing, they believe,
often concurs with impulsive behavior. The regions where
contractions or release are most noticeable are detected by an
analytical record sheet and a table of norms. The provision of 20 pages
of sample scoring makes this work a definitive text.

ILLUS. 167. COMPONENTS OF HANDWRITING

I. Form and Shape: related to ability to plan and integrate complex patterns.
a. Ornamentation – simplification
b. Contraction – amplification of contour
c. Contraction – amplification of connecting form
d. Width of stroke
e. Border of the stroke
f. Curvature of stroke

II. Vertical Component: related to intellectual, ego, and instinctive balance.
g. Height of middle zone (expression of self-feeling)
h'. Height of lower zone (expression of instinctive aspects)
h''. Height of upper zone (expression of intellectual aggression, physical sphere)
i. Direction of the line (enterprising, balanced, gloomy moods)
k. Amplitude of fluctuation of the line
l. Contraction in fluctuation of line
m. Space between lines

III. Horizontal Component: related to spontaneity and receptivity, self-confidence.
n. Space between letters
o. Breadth of letters
p. Direction of slant
q. Parallelism of downstroke
r. Left-right tendency
s. Distance between words
t'. Width of left margin
t''. Width of right margin

IV. Depth Component: related to available energy, vitality, and its control.
u. Degree of pressure
v. Control of pressure
w. Cursiveness of writing, degree of connection

(Reprinted from Thea Stein Lewinson and Joseph Zubin, Handwriting
Analysis. Copyright 1942 by Columbia University Press.)

Pascal (1944) used in his study twenty-two male college graduates 
who had been studied carefully by the staff of the Harvard Psycho- 
logical Clinic. Thirty-six personality variables were measured by the 
TAT (Chapter XVIII), and thirty-nine variables were secured 
from samples of spontaneous handwriting, for which each man used 
his favorite fountain pen. The twenty-two men's scores were changed
to rank-order for the group; Rho coefficients were computed and
changed to Pearson correlation coefficients. The results show that
ten handwriting variables bear a significant relationship to five
personality variables. The following correlations are all significant at
the 5-per-cent level, and the larger correlations at the one-per-cent
level:


1. Play: to avoid serious tension, relax;
     with mean upper projection divided by mean mid-zone height     .60
     with total vertical expanse                                     .51
     with mean distance between words                                .60
     with lower-zone fullness, mean lower-loop width divided by
       mean lower-projection height                                 -.15
     multiple regression                                             .78

2. Projectivity: tendency to project one's anxiety, evoked beliefs,
   mild delusions of self-reference;
     with mid-zone ratio, width divided by height                    .50
     with lower-zone fullness                                        .5?
     multiple regression                                             .78

3. Dominance and defendence vs. abasement;
     with mid-zone ratio                                             .42
     with balanced projections, upper projections minus lower
       projections                                                   .59
     multiple regression                                             .68

4. Infavoidance: to avoid failure, ridicule, or shame;
     with width of stroke of pen without pressure                    .46
     with angularity                                                 .42
     multiple regression                                             .62

5. Nurturance: to aid and protect a helpless object;
     with primary width of letters m, n, and u                       .48
     with width of stroke of pen without pressure                   -.45
     multiple regression                                             .66

Pascal concludes that for his twenty-two male subjects certain aspects
of handwriting are significantly related to measured aspects of
personality. Because of the small sample he refrains from specifying
these variables, but an inspection of his results shows that play and
relaxation go with relatively tall upper projections of letters and
with distance between words, while projectivity and anxiety are
related to relatively narrow letters in the mid-zone and relatively
narrow loops on g, j, q, and y. Dominance was found to be higher
among those having upper projections longer than lower
projections, and the reverse was true for abasement. These findings are
in line with those reported from the Bender-Gestalt and other drawing
tests in which there is a tendency for the extent of vertical distances
to be related to ascendancy and submission, and for cramped or
narrow drawing to be related to anxiety.
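
The rank-order step in Pascal's 1944 procedure can be illustrated with
a short sketch. The text does not say which formula was used to change
Rho into a Pearson coefficient; the code below assumes the classical
conversion r = 2 sin(pi * rho / 6), which holds only under bivariate
normality. The function names and the two short lists of scores are
hypothetical and are not Pascal's data.

```python
import math


def ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Rho from squared rank differences (exact when there are no ties)."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def rho_to_pearson(rho):
    """Assumed conversion: r = 2 sin(pi * rho / 6)."""
    return 2 * math.sin(math.pi * rho / 6)


# Hypothetical scores for five men on one handwriting variable
# and one personality variable.
handwriting = [1, 2, 3, 4, 5]
personality = [2, 1, 4, 3, 5]

rho = spearman_rho(handwriting, personality)
r = rho_to_pearson(rho)
print(f"rho = {rho:.2f}, converted r = {r:.3f}")
```

With a group as small as twenty-two, the converted coefficient differs
only slightly from Rho itself, so the conversion affects the reported
values much less than sampling error does.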

Another study by Pascal (1943) reported the use of a kymograph
to measure pressure in handwriting. He found fairly constant
pressure (5.4 grams) for one subject’s normal writing on ten occasions
during a 2-week period, but the same subject varied from 7.8 to 2.1
when asked to use a heavy or a light touch. Pascal found it necessary
to use a standard pen, for the type of instrument used influenced the
pressure scores considerably. Among his results are a correlation of
.69 between average pressure and range of pressure and of .30
between average pressure and speed. To indicate the significance of
pressure, he secured ratings of twenty-one men by seven psychologists
who had known them about a year. The five traits for which they
were rated were energy, expressiveness, impulsiveness, dominance,
and determination. A correlation of .54 was found between average
pressure and energy, of .60 between pressure range and energy, and
of .63 in a multiple correlation between energy and average pressure
and pressure range combined. The correlations between pressure and
determination and pressure and dominance were much lower, ranging
from .01 to .33. These results show in general that an important
aspect of writing can be measured by this sort of apparatus.
Considerable research is needed to devise methods for appraising
pressure from ordinary samples of handwriting.

A number of studies have reported results of relatively untrained
observers using rough inspection or global approaches. To the
trained graphologist these results are analogous to a layman’s
interpretation of an electrocardiogram or of a complex mathematical
formula when the interpreter does not know what the symbols stand
for. However, the results are usually much better than chance, hence
are worth considering. Graphologists usually insist upon knowing
the sex, age, and handedness of a subject before attempting an
interpretation.

Typical of a blind analysis is the report of Castelnuovo-Tedesco 
(1948), who secured two samples of handwriting from each subject; 
first, a direct copy of a 73-word mimeographed newspaper report; and 
second, a spontaneous sample of what each subject could remember 
of this report immediately after the sample and his original copy had 
been removed. For the “copy” test the subjects were fifty men and
fifty women whose IQ's ranged from 68 to 132. Those with IQ's below
82 were excluded from the “spontaneous” writing, hence forty-four
men and thirty-eight women were included. The spontaneous writing 
gave clues to vocabulary, grammar, length, punctuation, and style 
as well as to the handwriting itself. 

A group of six judges were used, three men and three women, only
one of whom had had training in graphology; two had Doctor's
degrees in French, two were graduate students, and two were
undergraduate students in literature or arts. All had marked aesthetic
interests and interest in the experiment. The judges rated each
specimen independently on a 5-point scale based on total impression, on
the six traits shown in Illus. 168. Each trait was rated at a separate
session. In order to provide criteria for these traits, scores were
obtained from various tests. To provide IQ's the Verbal
Wechsler-Bellevue was used with college groups, the full W-B Scale with
prisoners, and a Stanford-Binet was given for nine patients in a
state home.

ILLUS. 168. HANDWRITING AND PERSONALITY: CONTINGENCY
COEFFICIENTS BETWEEN MEASURES AND RATINGS OF
SIX JUDGES, AFTER TRAINING

Variable            Copy            Spontaneous

Intelligence        .64 *           .51
Originality         .60             .56
Anxiety             .11             .51
Compulsiveness      .32             .40
Masculinity         .33             .41
Physical sex        71% correct     76% correct

* All figures are significant at the 1 per cent level.

(Arranged from Tables 3 and 4, p. 207, Genetic Psychology Monograph No. 37,
by permission of Peter Castelnuovo-Tedesco and the editor of Genetic
Psychology Monographs.)

This variety of IQ testing procedure probably lowers any
correlation with the results. Intelligence quotients were also taken as
the best available criterion of originality. The other traits were
evaluated by multiple-choice Rorschach scores where anxiety was
indicated by poor form, color dominance, and outright rejection of
the card; compulsiveness by good form and small detail responses;
and masculinity-femininity by responses to Cards IV, VI, and VII,
which were considered to be characteristic of one sex and not of the
other. From Illus. 168 it appears that these judges were able to
estimate all of these traits by inspection of samples of handwriting to
a substantial degree. The differences between estimates from “copy”
work, and from “spontaneous” recall and writing were generally
small. In this connection it should be recalled that the “spontaneous”
group eliminated eighteen of the subjects with low IQ's from the
“copy” group. The “spontaneous” samples yielded somewhat better
coefficients for anxiety, compulsiveness, masculinity, and physical
sex than the “copy” samples.

A fascinating field of research lies here which seems to offer the 
advantages of objective measurement of commonly available mate- 
rial. 


STUDY GUIDE QUESTIONS 

1. What is included in a visual-motor gestalt?

2. What patterns are included in the Bender-Gestalt Test? How are the
results interpreted?

3. What stages of maturation did Bender describe?

4. What sorts of clinical findings does the Bender-Gestalt Test supply?

5. How is the Mira Motor Test administered and scored?

6. What patterns were found in the Mosaic Test?

7. How did Machover administer the drawing test and the inquiry
period?

8. How does Elkisch determine four factors of personality from drawings?

9. What are the chief advantages of finger painting?

10. How may preference for different media be an indication of
personality?

11. Make a list of characteristics of drawings and check each medium
against it.

12. Make a list of character traits related to characteristics of drawings.
Indicate some needs for more research.

13. What are the main variables in handwriting?

14. How may these variables be interpreted to show contraction and
release?

15. Which personal traits did Pascal find were most related to
handwriting characteristics?

16. What controls are necessary to develop more insight into the
significance of handwriting variables?



CHAPTER XVIII


STORIES AND FANTASIES 




In this chapter the appreciation and production of stories and poems
are considered from two widely different points of view: the
classroom and the clinic. In the classroom much emphasis is placed on
style and little on symbolic meaning or methods of production,
while in the clinic style is usually a minor consideration, and symbolic
content and method are analyzed in great detail and interpreted
according to various theories. The tests described here usually stress
content and yield results somewhat like those described in Chapter
XIX, but the latter stress perceptual organization more than
content.


LITERARY APPRECIATION 

Although literature is one of the oldest of the arts, attempts to
construct standard measuring instruments in this field are among the
most recent. These instruments are usually classified as tests of style,
appreciation, and composition.

Literary Style 

A number of language elements are basic to literary experiences
but probably have little, if any, relationship to artistic appreciation.
Such elements of language include recognition of spoken or written
words and knowledge of the meaning of grammatical forms and
punctuation. Tests of discrimination of these language elements have
been described in Chapter VII.

Another group of tests, designed to measure discrimination of
literary style, are illustrated by two tests from the University of
Chicago, reported by Stalnaker (1935). In one of the tests, five
different versions of a stanza are presented, and the students are asked
to mark one as the best and each of the others as poor for one, and
only one, of the following reasons: too sentimental, lack of imagery,
faulty rhythm, inappropriate imagery, and poor diction. The version
which was most frequently (25 per cent) chosen as best by five
hundred scholarship applicants but which was considered by the judges
to be poor because of inappropriate imagery and poor diction was:

Steeped in dust sleeps a pallid lady;
Thin and tall and blond was she;
She'd the palest hair and the bluest eyes
That ever were seen in this hot country
But loveliness dies and blue eyes slumber.
And the tomb is dark as dark can be;
And when I decay, who will remember
This lady of the West Country?

The selection which was chosen next most frequently (24 per cent) 
as the best but which the judges considered to be poor because of 
sentimentality was 

Here slumbers a wonderful mother
Bereft and sad her children three
They sadly mourn for their wonderful mother
Who taught them all to pray at her knee.
But mothers, alas, are a gift to heaven
We all must make, whoe’er we be;
And now she lives enshrined forever
In the heart of hearts of her children three.

The following selection which was considered best by the judges 
was selected by 23 per cent of the students, but 30 per cent of them 
felt that it lacked imagery:

Here lies a most beautiful lady:
Light of step and heart was she,
I think she was the most beautiful lady
That ever was in the West Country.
But beauty vanishes; beauty passes;
How rare – rare it be;

Five per cent of the students got perfect scores and 20 per cent,
zero. The mean was 1.7, which is only slightly above chance success.

In another type of test described by Stalnaker (1935), the students
were asked to match seven prose passages with seven short
descriptions of the prose passages. Two of the descriptions are given
below together with their passages:

Descriptions

and acceptance of the orthodox have never been his weakness.
Heterodoxy has always been a quality of the vigorous intellect and his
mind is electrically charged. Thus, when with nippiness and assurance he
tells us what he thinks, what relationships he has discovered between the
old and the new, we are ready to accept his precise and confident views. He
writes with simplicity and clarity, with the firmness of tone that one expects
from one whose spirit is firm and unyielding.

He has curiously mingled simplicity and gorgeousness all his own. His
delight in expression communicates to the reader delight in what he
expresses. He concedes them only by the desire to render his thought clear
and concrete. No one can, like him, pile up splendor of description, exotic
richness of phraseology, color, tones instinct with music, and then turn in
an instant to a sober, solemn, stately simplicity, direct and appealing like
the call of a herald. He makes life a procession to the grave but crowns it
with garlands.

Passages

When I was a child and was told that our dog and our parrot, with whom
I was on intimate terms, were not creatures like myself, but were brutal
whilst I was reasonable, I not only did not believe it, but quite consciously
and intellectually formed the opinion that the distinction was false, so that
afterwards, when Darwin’s views were first unfolded to me, I promptly said
that I had found out all that for myself before I was ten years old, and I am
far from sure that my youthful arrogance was not justified, for this sense
of the kinship of all forms of life is all that is needed to make Evolution not
only a conceivable theory but an inspiring one. St. Anthony was ripe for the
Evolution Theory when he preached to the fishes, and St. Francis when he
called the birds his little brothers. Our vanity had led us to insist on
God offering us special terms by placing us apart from and above all the
rest of his creatures.

But as when the sun approaches towards the gates of the morning, he
first opens a little eye of heaven, and sends away the spirits of darkness, and
gives light to a cock, and calls up the lark to matins, and by and by gilds
the fringes of a cloud, and peeps over the eastern hills, thrusting out his
golden horns like those which decked the brows of Moses when he was
forced to wear a veil, because himself had seen the face of God, and still
while a man tells the story the Sun gets up higher, till he shows a fair face
and a full light, and then he shines one whole day, under a cloud often,
and sometimes weeping great and little showers, and sets quickly, so is
man’s reason and his life.

Only 4 per cent of five hundred students of English who were
scholarship applicants received perfect scores and 21 per cent
received a score of zero. The mean score was 2.3 correctly matched out
of 7.
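
A chance baseline helps in reading these matching scores. The sketch
below assumes, as the text does not state, that each student paired
every description with a different passage, so that pure guessing
amounts to a random one-to-one matching. The function names are
hypothetical. Under that assumption the number of correct matches
follows the classical "matching problem," computed here from
derangement counts.

```python
from math import comb, factorial


def derangements(n):
    """Count the permutations of n items that leave no item in its own place."""
    if n == 0:
        return 1
    previous, current = 1, 0  # D(0) = 1, D(1) = 0
    for k in range(2, n + 1):
        # Recurrence: D(k) = (k - 1) * (D(k - 1) + D(k - 2))
        previous, current = current, (k - 1) * (current + previous)
    return current


def prob_exactly_correct(n, k):
    """Probability that a random one-to-one matching of n pairs gets exactly k right."""
    return comb(n, k) * derangements(n - k) / factorial(n)


def expected_correct(n):
    """Mean number of correct matches under random one-to-one matching."""
    return sum(k * prob_exactly_correct(n, k) for k in range(n + 1))


N_PASSAGES = 7
chance_zero = prob_exactly_correct(N_PASSAGES, 0)
chance_perfect = prob_exactly_correct(N_PASSAGES, N_PASSAGES)
chance_mean = expected_correct(N_PASSAGES)

print(f"chance of zero correct:    {chance_zero:.3f}")
print(f"chance of perfect score:   {chance_perfect:.5f}")
print(f"mean correct by guessing:  {chance_mean:.2f}")
```

On this assumption guessing alone would leave about 37 per cent of
students with a score of zero, give a perfect score to about 1 in
5,000, and yield a mean of one correct match. The reported 21 per cent
at zero and 4 per cent perfect therefore indicate performance above
chance, though still far from mastery.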





In an attempt to substitute actual discrimination for verbal
vagaries, Cannon (1937) selected two short passages of approximately
one hundred and fifty words from each of the following authors:
Joseph Addison, Ernest Hemingway, Francis Bacon, Charles Lamb,
R. L. Stevenson, H. L. Mencken, Samuel Pepys, John Lyly, Jonathan
Swift, and Lytton Strachey. Students were presented with the twenty
passages typed in random order and asked to indicate the two
selections by each author. (The authors' names were not given.) Among
51 college students, 37 succeeded in matching the passages from
Bacon, 28 from Pepys, 26 from Lamb, and 26 from Lyly. Only 7
students matched the two paragraphs from Stevenson, 13 from Strachey,
16 from Swift, and 16 from Mencken. Stevenson’s work was more
frequently matched with Hemingway’s than with his own. The
average number of correct matches was 4.09.

Such discrimination-of-style tests seem to have been applied
principally in college and to have proved extremely difficult even there.
The technique is simple enough, however, and the emphasis on
first-hand discrimination is good both in testing and in training.

Literary Information 

Both discrimination and appreciation of literary work are
probably to a large extent based on memory of the literature which one
has read. A widely read person with a good memory will almost
always make finer discriminations than one with equal ability who
has not enjoyed the same experience. Furthermore, although a
certain literary work may not be associated with a particular
individual’s pleasure, still a wide knowledge of literature will certainly
affect his feelings toward various masterpieces. From these
considerations it follows that discrimination, appreciation, and
information in the field of literature are all intimately connected. Quite
a number of tests about authors, literary characters, and
characteristics of literature are available. Typical of these are the
Jordan and Van Wagenen (1933) Scales of Attainment in Literature for
the seventh through twelfth grades, and the Cooperative Literary
Acquaintance Test of Beers and Paterson (1933) for secondary schools
and colleges. Both are composed mainly of 5-choice items which deal
with books usually assigned in literature study courses.

The first yields a total literary-age score similar to a mental age 
and also separate scores for evaluating the emphasis in learning, as 
follows: 

1. Information about literature, shown on such items as:

The poem Miles Standish is about: (1) a shipwreck, (2) an exploring
party, (3) an Indian attack, (4) a courtship, (5) scattering a settlement.

2. Information about authors:

The Lady of the Lake was written by: (1) Shakespeare, (2) Scott, (3)
Dickens, (4) Cooper, (5) Stevenson.

3. Outcomes of situations:

In Snowbound, the family spent their evening in: (1) listening to
stories, (2) dancing, (3) reading books, (4) playing cards, (5) reading
newspapers.

4. General impressions and characters:

Holmes' poem How the Old Horse Won the Bet is: (1) joyful, (2)
inspiring, (3) sad, (4) mysterious, (5) humorous.

The Cooperative Literary Acquaintance Test yields a total score
and also separate scores for: (a) Pre-Renaissance and Foreign, (b)
English and American from 1500 to 1900, and (c) modern English and
American. It consists of 12 pages which can usually be answered in
about 40 minutes.

Tests of this sort are by far the most widely used in the field of 
literature. Research is needed to show the interrelationships of abili- 
ties to discriminate, appreciate, remember, and compose literary work. 

Sounds and Poetry 

In the measurement of responses to oral presentations there
appear to be two somewhat independent aspects: sound and meaning.
The sound stimuli include various consonant, vowel, pitch, timbre,
and rhythm combinations. Memories of sound stimuli may also be
important in individual silent reading. The meaning of a selection
involves the recognition of simple words and literary references, and
also judgments of their appropriateness in a total pattern. The
following samples are representative of appreciation tests.

Sounds. The problem of determining the relative pleasantness
of speech sounds was approached by Roblee and Washburn (1912),
who read aloud a list of nonsense syllables which consisted of one
vowel followed by one consonant. The listeners rated each syllable
on a 7-point scale of pleasantness. The average ratings showed the
following order of preference for vowels:


1. a as in father
2. e as in get
3. o as in note
4. a as in late
5. (tied) i as in write; a as in hat
6. (tied) oo as in boot; i as in hid
7. (tied) o as in hot; ee as in feet; aw as in bawl
8. oi as in boil
9. u as in mud


The differences between the best and worst were so small and the
range for each vowel so large that there seemed to be no marked
agreement. Consonants showed slightly greater average differences
in the following order of pleasantness: l, m, n, v, th, s, z, p, d, b, zh, t,
sh, k, and g. Givler (1915) found similar results from experiments in
which vowels and consonants were combined in various ways, and
reported that the explosive consonants and the shorter vowels were
judged to be less agreeable than the others. No standard test using
such judgments seems to have been made.

Downey (1927) asked college students to record their reactions to
single words. In evaluating the responses she used the following
classification similar to that which Bullough (1910) used with colors.
(A few examples from her observers are also given.)

1. Objective responses to the sound or appearance of a word. Murmur:
the sound is the meaning.

2. Associative responses. Drowsy: sleepy; or Lily: visualization of one
tall lily in a bare space.

3. Physical responses within the subject. Pendulous: feeling of being
suspended.

4. Symbolic, an arbitrary association. Melancholy: a green and purple
word.

5. Personalized. The word is treated as a person. Twilight: word looks
half asleep.

Downey found no marked tendency, as Bullough had, for a person 
to use mainly one type of response. The three subjects who made the 
largest number of symbolized and personalized responses were those 
chiefly interested in literature and visual art, but the most “literary” 
person made many objective responses. No clear relation between 
type of response and literary ability was demonstrated, but the test 
needs to be refined and enlarged to give it reliability and internal con- 
sistency. This method may possibly be made to appraise important 
types of responses better than other methods.

Prose and Poetry. A number of tests have appeared which ask a
person to indicate relative literary merit among a series of short
samples. In none of these is any definition of literary merit provided,
hence emotional appeal, content, clarity, and style are probably
effective in unknown amounts. In view of the difficulty of analyzing
these elements, however, the standard tests doubtless represent a
marked advance over cruder methods of rating literary appreciation
of individuals.

Three test forms were prepared by Carroll (1935), one for use in
junior high school, another in senior high school, and a third in
college. Each form consists of a 16-page booklet. On each page four
short paragraphs of approximately one hundred words each are
printed. The paragraphs on each page are all concerned with the
same topic. Thus, in the college test the topics are a man, an interior,
a sunset, a fire, a tryst, spring, a woman, homecoming, wind, literary
criticism, twilight, remarks to a son, delirium, and sunrise. The
paragraphs on each page were selected to represent four levels of literary
excellence on the basis of reputation of the authors and judgments of
experts (Illus. 169). No elaborate analyses of judgments were made
to insure equally often-noticed steps between the levels of literary
excellence, but the ranking of the four samples by judges showed
marked agreement in making first and last choices, and fairly marked
agreement in selecting second and third choices. The student is asked

ILLUS. 169. SAMPLES OF A PROSE APPRECIATION TEST
An Interior *

A

I went with the little maid into a gorgeously decorated bedroom, all of cream
color and light blue that blended prettily. The bed was a great, wide affair of
beautifully carved and ornamented wood, painted creamy white with blue and
gold trimmings. There was a wonderful bureau and a dressing table to match,
and in one corner of the room a mirror that went from floor to ceiling. I had to
hold my breath.

B

Lollie had never seen such a pretty room, and it made her gasp to see how pretty
the furniture was, as well as how pretty the rugs were, and the curtains at the
windows and the pictures on the wall, but what she really liked best was that
furniture, for it looked comfortable as well as pretty, and she knew it must have
cost hundreds and hundreds of dollars. She wished she could live and die in that
one room, it was so pretty.

C

An air of Sabbath had descended on the room. The sun shone brightly through
the window, spreading a golden luster over the white walls; only along the north
wall, where the bed stood, a half shadow lingered. . . . The table had been spread
with a white cover; upon it lay the open hymn book, with the page turned down.
Beside the hymn book stood a bowl of water; beside that lay a piece of white
cloth. . . . Kjersti was tending the stove, piling the wood in diligently. . . .
Sorine sat in the corner, crooning over a tiny bundle; out of the bundle at
intervals came faint, wheezy chirrups, like the sounds that rise from a nest of
young birds.

D

Major Prime had the west sitting-room. It was lined with low bookcases, full
of old, old books. There was a fireplace, a winged chair, a broad couch, a big desk
of dark seasoned mahogany, and over the mantel a steel engraving of Robert E.
Lee. The low windows at the back looked out upon the wooded green of the
ascending hill; at the front was a porch which gave a view of the valley.

* Key: First, C; second, D; third, A; fourth, B.

(Carroll, 1935, p. 3. By permission of the Educational Test Bureau,
Minneapolis, Minn.)
to rank the four selections on each page in the order of his estimate
of their “literary merit.” The pages are arranged in order of ease of
discrimination, ranging from the easiest, which was correctly ranked by
approximately 60 per cent of a group, to the most difficult, which
was ranked correctly by only 30 per cent.

Both split-half reliability and retest reliability for groups of three
hundred students were approximately .70. In scoring, 2 points were
allowed for each correct ranking of the best and worst selections and
one point each for the second and third selections. The total score
for each page was therefore 6, and for the whole test, 84. The mean
scores for college freshmen were 45.9; for sophomores, 47.0; for
juniors, 49.8; and for seniors, 52.8. Centiles are furnished for each class in
college. Grade norms in junior and senior high schools are also
provided for their respective forms.
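
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, a ranking is written as four sample labels with the best first, and the figure of fourteen scored pages is inferred from the stated maximum of 84 at 6 points per page rather than taken from the test manual.

```python
# Hypothetical sketch of the page-scoring rule described in the text.
# A ranking is a sequence of four sample labels, best first.

# Points for a correct placement in each position: best, second, third, worst.
POINTS_BY_POSITION = (2, 1, 1, 2)


def score_page(key, response):
    """Return the points earned on one page (0 to 6)."""
    if len(key) != 4 or sorted(key) != sorted(response):
        raise ValueError("key and response must rank the same four samples")
    return sum(
        points
        for points, correct, given in zip(POINTS_BY_POSITION, key, response)
        if correct == given
    )


def score_test(keys, responses):
    """Sum the page scores over the whole booklet."""
    return sum(score_page(k, r) for k, r in zip(keys, responses))


key = ["C", "D", "A", "B"]  # key printed under Illus. 169

print(score_page(key, ["C", "D", "A", "B"]))  # all four placed correctly
print(score_page(key, ["C", "A", "D", "B"]))  # only best and worst correct
# Assumed: 14 scored pages, inferred from 84 / 6.
print(score_test([key] * 14, [key] * 14))
```

A student who places only the best and worst samples correctly earns 4 of the 6 points on a page, so the rule gives most of the credit to the two choices on which the judges themselves agreed most closely.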

A test designed to measure poetry appreciation was published by 
Rigg (1937). Forty short selections by recognized poets were paired 
with forty other inferior selections on the basis of similarity of central 
thought. The task was to choose the best selection in each pair. Two 
equivalent forms were found to correlate .815 among 342 college
students. 

A multiple-choice test of literary appreciation was described by 
Fox (1938). The test requires that omissions in various passages of 
prose and poetry by famous authors be filled in by choosing one of 
four alternatives. It was found that graduates in literary subjects 
chose the words written by the author more frequently than did 
graduates in nonliterary subjects. From introspections Fox concluded 
that the students of literature employed a “literary feeling” and the 
others tried to exercise “critical judgment.” 

STORY PRODUCTION AND WORD ASSOCIATIONS 
Analyses of Fantasies 

Fantasies are stories which are normally recognized by the teller as
make-believe. They may express a large variety of themes, some of
which are wishes of the author. Some wishes are deeply disguised,
some quite obvious. Much fantasy is expressed in painting, folklore,
drama, and dancing. Clinical experience has often shown that
analyses of fantasy reveal an individual’s needs, his method of
meeting them, and his attitudes toward himself. Fantasy analysis may
show that a successful person may be making a fine record as a
compensation for feelings of inferiority or a physical disability. Clinicians
believe that a free expression of fantasy should generally be
encouraged, since it relieves tensions and is usually less harmful than
repression. Fantasies of death or anxiety often give considerable
relief and reveal causes of emotional disturbances which must be
removed if one is to do his best. Clinicians and psychiatrists use fantasy
analyses as one of the principal means of discovering both
superficial and deeper needs.

Thematic Apperception Test (TAT). This test was developed
over a period of years at the Harvard Psychological Clinic by Dr.
Henry A. Murray and his associates (1943). A series of twenty
somewhat ambiguous pictures is presented to a person and he is
encouraged to tell spontaneous stories about them. The pictures are
effective in three ways: they stimulate fanciful imagination and make
the subject less self-conscious; encourage the subject to react to
certain common conflict situations; and allow more systematic and
comparable appraisals of individuals than might be made without
standard pictures.

The TAT pictures were standardized on groups of persons between
fourteen and forty years of age. Of the 1943 set of thirty-one pictures,
eleven are suitable for both sexes, seven are used only for boys and
men, seven for girls and women, and one each for boys, girls, men,
women, boys and girls, and adults. Illustration 170 is a sample from
this series.

The pictures were selected from a larger series according to the
amount of information that each contributed to the final diagnosis.
The most revealing pictures were usually found to be those which
contained a representation of a person of about the same age and sex
as the subject.

Figures are included which may represent father, mother, siblings,
and marital partners in various situations of acceptance or rejection.
The first ten pictures are of usual situations; the last ten are vague,
dramatic, or bizarre, and there is one white blank space. The first ten
are to be shown during one hour on one day, and the second ten
during one hour on the next day or several days later.

Congenial surroundings and a sympathetic examiner are essential
in administering this test, because good results depend upon
creativity. Even in good circumstances about one third of the stories will
usually not contain personal elements. The subject is seated
comfortably in a chair, usually with his back to the examiner, and one
of two sets of instructions is read at the first session (Murray, 1943,
p. 3).

Form A (suitable for adolescents and for adults of average intelligence 
and sophistication) 

This is a test of imagination, one form of intelligence. I am going to show
you some pictures, one at a time, and your task will be to make up as
ILLUS. 170. THEMATIC APPERCEPTION TEST NO. 12F



(By permission of Dr. H. A. Murray and the Harvard University Press.) 


dramatic a story as you can for each. Tell what has led up to the event shown 
in the picture, describe what is happening at the moment, what the
characters are feeling and thinking; and then give the outcome. Speak your
thoughts as they come to your mind. Do you understand? Since you have 50 
minutes for ten pictures, you can devote about 5 minutes to each story. 
Here is the first picture. 

Form B (suitable for children, for adults of little education or intelligence, 
and for psychotics). 

This is a story-telling test. I have some pictures here that I am going to 
show you, and for each picture I want you to make up a story. Tell what 
has happened before and what is happening now. Say what the people
are feeling and thinking and how it will come out. You can make up any kind
of story you please. Do you understand? Well, then, here is the first picture.
You have 5 minutes to make up a story. See how well you can do.

After finishing the first story the subject is commended (if there is any
ground for it), and then reminded of the instructions (unless he has obeyed
them faithfully). For example, the examiner might say:

“That was certainly an interesting story, but you forgot to say how the
boy behaved when his mother criticized him and you left the narrative
hanging in the air. There was no real outcome. You spent 3½ minutes on
that story. Your others can be a little longer. Now see how well you can do
with the second picture.”

Young children, people of other cultures, and psychotics often need a
good deal of encouragement before they will speak freely. In administering
the test to extremely reticent children it is permissible to offer rewards. The
examiner may say: “I’ll give you a present if you tell me some nice long
stories today”; or, “If you do well now I’ll tell you a very exciting story when
you’re through”; or, “There’s a prize for the one who tells the best stories.”

At the second session either Form A or Form B instructions are
given as follows:

Form A. The procedure today is the same as before, only this time you
can give freer rein to your imagination. Your first ten stories were excellent,
but you confined yourself pretty much to the facts of everyday life. Now I
would like to see what you can do when you disregard the commonplace
realities and let your imagination have its way, as in a myth, fairy story,
or allegory. Here is Picture No. 1.

Form B. Today I am going to show you some more pictures. It will be
easier for you this time because the pictures I have here are much better,
more interesting. You told me some fine stories the other day. Now I want
to see whether you can make up a few more. Make them even more exciting
than you did last time if you can, like a dream or fairy tale. Here is the
first picture.

Blank Card. Card No. 16 is accompanied by a special instruction. The
examiner says:

“See what you can see on this blank card. Imagine some picture there and
describe it to me in detail.” If the subject does not succeed in doing this,
the examiner says, “Close your eyes and picture something.” After the
subject has given a full description of his imagery, the examiner says, “Now
tell me a story about it.”

The stories should be recorded stenographically, or by a sound
recorder, or by detailed notes. A subsequent interview is held
immediately after the second session or within a few days, at which the
subject is urged to indicate the source of each theme or incident in
each story. He is reminded of the plot of each story, if necessary.

The instructions state that the interpreter of the material





. . . should have a background of clinical experience, observing,
interviewing and testing patients of all sorts; and, if he is to get much below
the surface, knowledge of psychoanalysis and some practice in translating
the imagery of dreams and ordinary speech into elementary psychological
components. In addition he should have had months of training in the use
of this specific test, much practice in analyzing stories when it is possible
to check each conclusion against the known facts of thoroughly studied
personalities.

Before starting to interpret the stories, the examiner should know 
the sex and age of the subject and his siblings, his vocational and 
marital status, whether his parents are dead or separated, and other 
pertinent relationships. The following three steps are in Murray’s
interpretation: 

a. First, one determines the character with whom the subject has 
identified himself (hero) and records the strength of his needs or 
drives. The needs described by Murray (1938, pp. 211-213) are as 
follows: 

I. Primary (viscerogenic) needs:
   1. n Air
      a) n Inspiration;
      b) n Expiration;
   2. n Water;
   3. n Food;
   4. n Sex;
   5. n Lactation;
   6. n Urination;
   7. n Defecation;
   8. n Harmavoidance (avoidance of physical pain);
   9. n Noxavoidance (avoidance of noxious substances);
   10. n Heatavoidance;
   11. n Coldavoidance;
   12. n Sentience (sensuous gratification).

II. Secondary (psychogenic) needs:
   1. Actions associated with inanimate objects:
      a) n Acquisition (acquisitive attitude);
      b) n Conservance (conserving attitude);
      c) n Order (orderly attitude);
      d) n Retention (retentive attitude);
      e) n Construction (constructive attitude);
   2. Actions expressing ambition, will-to-power, desire for
      accomplishment and prestige:
      a) n Superiority (ambitious attitude);
      b) n Achievement (achievant attitude);
      c) n Recognition (self-forwarding attitude);
      d) n Exhibition (exhibitionistic attitude), combined with
         n Recognition in Explorations in Personality, and the
         opposite of n Seclusion;
   3. Desires and actions which defend the status or avoid humiliation:
      a) n Inviolacy (inviolate attitude), divided into three needs:
         1) n Infavoidance (infavoidant attitude), to prevent
            humiliation;
         2) n Defendance (defensive attitude);
         3) n Counteraction (counteractive attitude), need to redeem
            the self after failure, etc.;
   4. Needs having to do with human power exerted, resisted, or
      yielded to:
      a) n Dominance (dominative attitude);
      b) n Deference (deferent attitude);
      c) n Similance (suggestible attitude);
      d) n Autonomy (autonomous attitude);
      e) n Contrarience (contrarient attitude);
   5. Sado-masochistic needs:
      a) n Aggression (aggressive attitude);
      b) n Abasement (abasive attitude);
   6. n Blamavoidance (blamavoidant attitude);
   7. Needs regarding affection between people:
      a) n Affiliation (affiliative attitude);
      b) n Rejection (rejective attitude);
      c) n Nurturance (nurturant attitude);
      d) n Succorance (succorant attitude);
   8. n Play (playful attitude), included in list with some hesitation;
   9. Need to ask and to tell:
      a) n Cognizance (inquiring attitude);
      b) n Exposition (expositive attitude);
   10. Needs associated with energy:
      a) n Activity;
      b) n Passivity.¹

The strength of each drive in each story is rated from 1 to 5, with
1 representing a slight occurrence and 5 a great intensity, duration,
frequency, or importance in the plot. After the twenty stories have
been scored, the ratings for each variable are added and compared
with norms for the subject’s age and sex.
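
The adding of ratings described here can be sketched as follows. This is a minimal illustration, not Murray's own scoring form: the function names are hypothetical, each story is written as a mapping from need to its 1-to-5 rating, only three stories are shown instead of twenty, and the norm values are invented for the example.

```python
from collections import defaultdict


def total_need_ratings(stories):
    """Add the 1-to-5 strength ratings of each need over all stories."""
    totals = defaultdict(int)
    for story in stories:
        for need, rating in story.items():
            if not 1 <= rating <= 5:
                raise ValueError(f"rating for {need} must be 1 to 5, got {rating}")
            totals[need] += rating
    return dict(totals)


def deviation_from_norms(totals, norms):
    """Return total minus norm for every need that has a norm."""
    return {need: totals.get(need, 0) - norm for need, norm in norms.items()}


# Three stories instead of twenty, to keep the illustration short.
stories = [
    {"n Achievement": 3, "n Aggression": 1},
    {"n Achievement": 5},
    {"n Aggression": 2, "n Affiliation": 4},
]
# Invented norm values for illustration only; not published norms.
norms = {"n Achievement": 6, "n Aggression": 5}

totals = total_need_ratings(stories)
print(totals)
print(deviation_from_norms(totals, norms))
```

A need that is absent from a story simply adds nothing to its total, which matches the practice of rating only those drives that actually appear in the plot.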

b. Second, one determines the “press,” that is, the types and
strengths of the environmental forces which press upon the “hero.”
Particular attention is given to imaginary situations or persons not
represented in the picture, to the type of persons who exert the most
press, and also to the absence of beneficial associates. The presses as
listed by Murray (1938, p. 291) are as follows:

¹ From Explorations in Personality by H. A. Murray et al. Copyright 1938 by
Oxford University Press.
1. p Family Insupport
2. p Danger or Misfortune
3. p Lack or Loss
4. p Retention, Withholding Objects
5. p Rejection, Unconcern and Scorn
6. p Rival, Competing Contemporary
7. p Birth of Sibling
8. p Aggression
9. Fp Aggression-Dominance, Punishment (fusion press)
10. p Dominance, Coercion and Prohibition
11. Fp Dominance-Nurturance
12. p Nurturance, Indulgence
13. p Succorance, Demands for Tenderness
14. p Deference, Praise, Recognition
15. p Affiliation, Friendships
16. p Sex
17. p Deception or Betrayal

Intraorganic Press:

18. p Illness
19. p Operations
20. p Inferiority ²

c. Third, the interaction of the hero and the environment is
summarized to show typical patterns or “themas,” which are abstracts of
the dynamic patterns and their outcomes. These patterns show
certain facts, for example, to what extent the hero strives to make things
happen or waits for things to happen, how much he helps others and
they help him, whether he gets properly punished or let off, and the
ratio of happy and unhappy endings (Illus. 172). These abstracts
are made by taking each unusually high need of the hero and noting
the press which supports or defeats it. An inspection of these
need-press combinations may yield significant over-all thematic patterns,
which are regarded by Murray as good “leads” or hypotheses to be
verified by other methods.

Although Murray points out that the use of the TAT is not de- 
pendent upon any particular theory of personality, he distinguishes 
three layers of normal personality. 

a. The inner layer is composed of repressed unconscious needs 
rarely known or expressed in their crude form. These are usually 
expressed symbolically in the second TAT session, but cannot be 
known without careful psychoanalytical interpretation. 

b. The middle layer contains conscious needs which are known
but not usually confessed. They are manifested in secret and when
one is off guard. These are usually shown in either the first or the
second TAT session.

² Ibid.

c. The outer layer consists of tendencies which are publicly ac- 
knowledged and openly manifested in behavior. Stories composed in 
the first TAT session are usually more closely related to the outer 
layer of personality. 

Murray points out that the TAT themas often reflect the exact op- 
posite of one’s obvious usual behavior. Strong but inhibited needs 
appear in the TAT findings. Thus correlations of .40 or more between
TAT ratings and behavior ratings of the same traits of college
men were found for traits which have cultural sanction, for example,
creation, dominance, exposition, nurturance, passivity, and
dejection. For other traits without cultural sanctions, for example,
sex behavior, the correlations were from -.33 to -.74.

Additional TAT Studies. The TAT has stimulated a large number
of researches, some of which have applied the materials to various
groups and others have proposed variations in administration and
scoring. Only a few samples are mentioned here.

R. N. Sanford (1943) analyzed children’s stories in the extensive
Harvard Growth Study of School Children and described
interrelations between personality, physique, and environmental conditions.
He combined those variables that correlated with each other in the
neighborhood of .40 into syndromes and gave them names which were
intended to describe common elements. Thus one syndrome called
“orderly production” included creativity, endurance, and needs for
order, construction, and counteraction. This syndrome correlated
only .16 with mental age but .70 with school grades when age was
held fairly constant. Another syndrome called “conscientious effort”
included deliberation, conjunctivity, endurance, and needs for
understanding, construction, counteraction, blamavoidance, and order.
The average correlation between pairs of variables in the
orderly-production syndrome was .52, and in the conscientious-effort
syndrome .58. He described twenty such syndromes which form an
interesting basis for personality study.
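
The "average correlation between pairs of variables" reported for a syndrome can be computed as in the following sketch. It is offered only as an illustration of the arithmetic: the function names are hypothetical, the Pearson coefficient is assumed to be the correlation in question, and the ratings are invented rather than taken from Sanford's data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_pairwise_correlation(scores):
    """Average the correlations over every pair of variables in a cluster."""
    pairs = list(combinations(scores, 2))
    return sum(pearson(scores[a], scores[b]) for a, b in pairs) / len(pairs)


# Invented ratings of five children on three variables (illustration only).
cluster = {
    "n Order": [2, 3, 3, 4, 5],
    "n Construction": [1, 3, 2, 4, 4],
    "endurance": [2, 2, 3, 5, 4],
}
print(round(mean_pairwise_correlation(cluster), 2))
```

A cluster of k variables has k(k - 1)/2 pairs, so a five-variable syndrome such as "orderly production" rests on ten correlations, and the reported average summarizes all of them in one figure.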

One example of clear-cut results from the use of the TAT is
reported by Klebanoff (1947), who compared alcoholics with normal
men. The essential findings are summarized in Illus. 171, which
gives the mean percentage of occurrence of all the major thematic
categories. The individual records showed a striking similarity to
the group means, as was borne out by inspection and by the
standard deviations presented in Illus. 171. The alcoholics showed much
greater emotional stress and failure of the central character, and
ILLUS. 171. COMPARISON OF NORMAL AND ALCOHOLIC
MALES ON THE TAT

Percentage of Total Themas

                                       Alcoholic      Normal
Major Categories                     Mean      SD       Mean

Physical Aggression                    21     9.8         33
Nonphysical Aggression                 17     7.9         20
Internal Emotional Stress              48    11.3         25
Miscellaneous Themas                   14     7.0         22
Total                                 100                100

Successes and Failures

Failure: Central Character             59    15.8         35
Success: Central Character             10     4.1         28
Failure: Minor Characters               9     4.1         20
Success: Minor Characters              22    12.5         17
Total                                 100                100

Areas of Failure

Economic Failure                        6     6.2
Social Failure                         37     9.1
Power Failure                          42     8.9
Love Failure                           15     6.5
Total                                 100

(After Klebanoff, 1947. By permission of the Editor of Journal of
Consulting Psychology.)


much less physical aggression and failure of minor characters than
the normal group.

Illustration 171 also analyzes the failures among the alcoholics' 
themas of the central characters in terms of economic, social, power, 
and love inferiority. Failure at the power level predominates in the 
alcoholics and accounts for nearly half of all the failures of central 
characters. Of the 17 patients studied, 7 place greatest emphasis upon 
power failures, 6 are dominated by social failure, and the remaining
4 reveal an equality of social and power failure. Among all
patients power and social failures were more numerous than economic
and love failures.

L. D. Eron (1948) points out that there is a serious lack of adequate
normative data for the TAT. Many investigators have published
characteristics as representative of clinical groups, but only
a few of them have given frequencies substantiating these diagnostic 
cues. The norms seem to be the result of subjective impressions left 
after the examination of persons who have particular diagnoses. 
Eron’s procedure was to follow Murray’s directions in presenting all
twenty cards to adult males at the Harvard University Clinic and to
male patients who had been diagnosed as schizophrenics in the
Ventura Veterans Hospital. A total of one thousand stories
containing 1,988 themes was recorded for all subjects. The difference in
the total number of themes was not significant: 963 themes in the
schizophrenic stories and 1,025 in the student stories. With rare
exceptions no themes appeared in one group which were absent from the
other, and in those few categories in which this did occur the
frequency of the theme was no more than one or two. More than a hundred
comparisons were made between the two groups, but differences that were
significant at or beyond the 5-per-cent level were found in only
thirteen themes. It appeared that the greatest difference is among
themes of moral struggle, where students exceed the schizophrenics
by a significant number. However, such themes of moral struggle
have been given as one of the diagnostic cues in schizophrenic
patients. The other themes likewise give rise to serious questions
concerning the diagnostic validity of the particular theme.
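
The weight to be given to thirteen significant differences depends on how many would be expected by chance alone. The sketch below works through that arithmetic under assumptions that are not part of Eron's report: the number of comparisons is set at exactly 100 (the text says only "more than a hundred"), the comparisons are treated as independent, and the function names are hypothetical.

```python
from math import comb


def expected_by_chance(comparisons, alpha):
    """Number of 'significant' results expected if no real difference exists."""
    return comparisons * alpha


def prob_at_least(k, comparisons, alpha):
    """Binomial probability of k or more chance results, assuming independence."""
    return sum(
        comb(comparisons, i) * alpha**i * (1 - alpha) ** (comparisons - i)
        for i in range(k, comparisons + 1)
    )


# Assumed figure: the text gives "more than a hundred" comparisons.
COMPARISONS = 100
ALPHA = 0.05  # the 5-per-cent level

print(expected_by_chance(COMPARISONS, ALPHA))
print(prob_at_least(13, COMPARISONS, ALPHA))
```

With these assumed figures about five differences would reach the 5-per-cent level by chance, so some of the thirteen reported differences may reflect sampling fluctuation rather than true group differences, which is in keeping with the caution expressed in the text.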

Eron has further compared the most frequent themes for college
students and schizophrenics and finds that the college students give
significantly more imaginary and symbolic stories, and more themes of
guilt, remorse, pressure from parents, and disequilibrium than do
the schizophrenics. The schizophrenics give more themes of religion,
retribution, pressure, illness, or death of heterosexual partner, and
succorance from the parents. Eron finds, however, that the most
significant results are the trait similarities between the two groups.
There are no broad group differences. At least some of the
schizophrenics fall into every one of the normal categories. He feels,
therefore, that the responses on the TAT cards are determined more
by the actual stimulus cards than by the personality deviations of
the subjects.

Lastly, Eron lists, picture by picture, a tabulation of the most
common themes appearing in one thousand TAT stories of 25
hospitalized schizophrenics and of 25 nonhospitalized students (Illus. 172).
Both groups are comparable in education, age, sex, IQ, veteran status,
and marital status. All are male veterans with an IQ of at least 100.
All the themes were related by at least ten subjects. For sixteen of the
twenty cards the number of themes showing disequilibrium is much
greater than the number of themes showing equilibrium. There
were only forty-five instances of confusion or indecision about the
sex of the central character in one thousand stories, and a few more
of these came from college students than from patients. In the light
of these results it is felt that a particular examiner must be cautious
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ILLUS 172 PICTURE BY PICTURE TABULATION OF COMMON THEMES 

IN TAT STORIES 

[Included in this table are responses of 25 hospitalized schizophrenics and 25 non- 
hospitalized college students Both groups are comparable in terms of age, edu- 
cation, sex, IQ, veteran status, marital status. All are male veterans of ages 20 
to 34 years, with twehe to seventeen \ears* education, and an IQ of at least 100. 
Twelve are mairied and thirty-eight are single All themes related by at least 
ten subjects for each picture are included.] 

Fre- 

Theme Defimtion quency 

Picture I 

pressure from parent figures are prohibitive, compelling, censuring, 

parents punishing, quarreling with child 26 

aspiration dreaming of future, hoping for future, determination 26 

vacillation wasting time, putting off a distasteful task, procras- 
tination, loitering 12 

curiosity wondering or inquiring about construction of object, 

contents of room, etc 11 

inadequacy realization, whether justified or not, of lack of success 10 

belongingness desire expressed to be with or accepted by peers 10 

Picture II 

aspiration dreaming of future, hoping for future, determination 26 

economic compelled to or prohibited from, or limited in doing 


pressure something because of lack of money 21 

Picture III EM* 

pressure from parent figures are prohibitive, compelling, censuring, 

parents punishing, quarreling with child 13 

suicide attempted or completed, preoccupation with 12 

generalized environment is generally frustrating 12 


restriction 

Picture IV 
pressure from 
partner 
partner 
comforts 

Picture V 


pressure from 

parent is prohibitive, compelling, censuring, punish- 


parents 

ing, quarreling with child 

20 

cunosity 

wondering or inquiring about construction of object, 


Picture VI BM 

contents of room, etc 

12 

pressure from 

parent is prohibitive, compelling, censuring, punish- 


parent 

ing, quarreling with child 

21 

departure from 

child is taking leave of parental home 

18 


parent 

• The initials stand for B (Boy), M (Man), F (Female). 


partner is prohibitive, compelling, censuring, punish- 
ing or quarreling, etc. 20 

a positive relationship, sets at ease, conciliates, regales 18 
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lIXUS. 172. PICTURE BY PICTURE TABULATION OF COMMON THEMES 
IN TAT STORIES (Cant’d) 

Fre~ 

Theme Defiiiition quency 

filial obligation child feels it his duty to remain with, comply with, or 

support parents 13 

disappointment child does not live up to parent’s expectations 13 

to parent 

concern of parent parent is worried over physical or mental well-being 

of child 11 

aggression toward robbeiy, accident, murdei (of unspecified indiv id- 
environment ual) 10 

Picture vn BM 

succorance from child seeks or receives aid, advice, consolation, pro- 
parents tection from parent 31 

pressure from parent is prohibitive, compelling, censuring, punish- 

parents ing, quaiieling with child 11 

Picture VIIIBM 

aspiration di earning of futinr, hoping for futuie, determination 23 

aggression fiom wax, accident, nauuc, aiiiinal, disease 10 

impel sonal source 

Picture IX BM
retirement                         central character asleep, resting, etc.   41

Picture X
partner contentment                serenity in marital life, satisfaction
                                   with partner, marital bliss,
                                   heterosexual bliss                        27

Picture XI
aggression from impersonal source  war, accident, nature, animal, disease    22
aggression towards peer            physical harm inflicted or intended for
                                   individual of same sex and approximately
                                   same age; physical violence between two
                                   animals                                   10

Picture XII M
religion                           prayer, seeking consolation from God,
                                   religious conflict, religious awakening   Ij
death or illness of son                                                      14

Picture XIII MF
guilt-remorse                                                                22
death or illness of partner                                                  18
illicit sex                        extra- or pre-marital heterosexual
                                   relation, non-incestuous                  18
aggression toward partner          physical harm inflicted or intended for
                                   heterosexual partner                      16

ILLUS. 172. PICTURE BY PICTURE TABULATION OF COMMON THEMES
IN TAT STORIES (Cont'd)

Theme                              Definition                                Frequency

Picture XIV
aspiration                         dreaming of future, hope for future,
                                   determination                             16
tranquillity                       peace of mind, content with environment
                                   and own accomplishments                   15

Picture XV
death or illness of partner                                                  17
religion                           prayer, seeking consolation from God,
                                   religious conflict, religious awakening   16

Picture XVI
Since there is no individual category of sufficient frequency, only more
general categories can be included.
intrapersonal disequilibrium                                                 17
interpersonal equilibrium                                                    15
impersonal disequilibrium                                                    13
intrapersonal equilibrium                                                    12
interpersonal disequilibrium                                                 10

Picture XVII BM
self-esteem                        egocentricity, self-confidence,
                                   self-respect, self-approbation            14

Picture XVIII BM
drunkenness                                                                  22
succorance from peer               seek or receive aid, advice, consolation,
                                   protection from peer                      17
pressure from peer                 friends are prohibitive, compelling,
                                   censuring, punishing, quarreling          15

Picture XIX
aggression from impersonal source  war, accident, nature, animal, disease    28
imaginary theme level                                                        H

Picture XX
vacillation                        wasting time, putting off a distasteful
                                   task, procrastination, loitering          23
loneliness                         central character misses someone, is an
                                   outcast, friendless, homeless             18

(From Eron (1948), page 393. By permission of Leonard D. Eron and the editors
of the Journal of Consulting Psychology.)

in using the TAT as a diagnostic instrument, and in applying the
cues reported in the literature by various investigators.

A large number of other studies on the application of the TAT
have reported:

a. Variations in scoring and interpretation

b. Group administration with responses to a check list of prepared
stories

c. Comparisons between TAT stories and variations of age, sex,
and conditions of administration

d. Comparisons with dreams and autobiographies

e. Norms for stutterers, neurotics, the mentally deficient, and other
clinical groups

f. Anti-Semitism and attitudes toward labor as reflected in some
of the TAT results

Much more work is needed on analyzing the results of fantasy tests
and on giving the interpretations clearer meanings. One of the most
rewarding uses of the TAT is not a metricized score or profile but a
picture of the way a person is reacting to his drives and opportunities.

Symonds Picture-Story Test. Another set of twenty pictures, all
drawn by Lynd Ward, was issued by Symonds (1948) to be used to
study adolescents. It is not designed to yield a clinical diagnosis or
indicate learning status or potentialities. It is a procedure used to
study drives, conflicts, and methods of dealing with drives, and
should be of value in planning psychotherapy. There is evidence that
the test has some therapeutic value of its own. It contains twenty
pictures separated into Set A and Set B. Set A is to be used at the
first sitting and B on a later day. Set B usually gives the more
significant results. The same pictures are used for both boys and girls,
because Symonds found that sex differences were relatively insignificant
in the interpretation of stories.

The test, which is to be administered only after good rapport has
been established, is introduced as a test of creative imagination.
About one hour is needed for each set of pictures. Following Set B
there should be a “period of association” in which the examiner
reads back the story while the subject holds the corresponding
picture. After each story is read the subject is asked where it came
from, and the answers are recorded. The analysis of results yields a
series of hypotheses regarding underlying motives, parent-child
relations, and modes of operating. The frequency of the various topics
or themes used by a person is compared with a frequency table for
adolescents. Symonds' Adolescent Fantasy (1949) gives case material
and elaborates on the use of this test.

Picture-Frustration. Rosenzweig (1945) described a picture-association
test which is in some respects similar to a word-association
test and to a thematic-apperception test. His stimuli contain both
pictures and words, and the test administration limits the responses
in both length and content. Each of twenty-four cartoon-like pictures
illustrates a fairly common frustrating situation for one of two
persons in the picture, who is always shown on the right side of the
picture (Illus. 173). On the left is the other person who is saying
certain words which either describe the situation more fully or
actually intensify the frustrating situation for the person on the right.
Features and facial expression are purposely omitted to facilitate the
projection of feelings by the person being tested. The stimulus
pictures include sixteen ego-blocking situations when the person is
directly frustrated, as by being splashed with muddy water by a car.
Eight other pictures are intended to involve the superego by means
of accusations, charges, or incriminations. In these attacks there is an
implication that the blocking of the ego has already occurred. They
add insult to injury.

ILLUS. 173. ROSENZWEIG'S FRUSTRATION TESTS

(Situation 2 is of the superego frustration type; the other three are
ego-blocking types. Copyright, 1948, by Saul Rosenzweig.)

The subject, in order to show his reaction, is asked to write in an
empty space above the picture on the right the first reply that comes
into his mind. “Avoid being humorous. Work as quickly as you can.”
Early experiences showed that comical replies did not allow the same
type of scoring as serious replies. Speed is emphasized in order to
avoid studied evasions.

The examiner reads the words of the character on the left aloud,
then asks the subject to think of a reply and write it down. The total
time for all twenty-four responses is recorded, whether in group or
individual administration. If the administration is for a single
individual, a subsequent period is used for having the subject read aloud
the responses he has written. Tone of voice, manners, and hesitations
are noted and nonleading questions are asked about very brief
responses.

Each response is scored on two scales: Direction of Aggression,
and Ego Situation. Aggression or blame can take three directions:
(a) toward the environment, including other persons, (b) toward
self, and (c) toward no one, when it is claimed that no one is to blame,
that the situation is not significant, or that just waiting will correct
it. The Ego Situation may also take three main forms: (a)
obstacle-dominance, in which the “barrier” is emphasized, (b) ego-dominance,
in which the defense of subject (ego) plays the chief role, or (c)
need-persistence, in which the need for solving the problem or for
relieving the situation is stressed.

Sample answers derived from one hundred normal individuals
and fifty mental patients are given. In scoring a test the individual
answers are classified and entered on a record blank which has
printed on it the expected scores of twelve items. These expected
scores are used as criteria of normalcy, and a Group Conformity
Rating (GCR) is secured by computing the percentage of complete
agreement with the criteria. Thus if half of the subject's scores on the
twelve items agree with the criteria, his GCR is 50 per cent. GCRs
for a group of 50 normal adult men averaged 72, for 50 normal adult
women 68, and for 50 mental patients 57. Additional scores are
secured to show the frequency of each type of response. Thus the
median per cents for one hundred and fifty normal men and women
were approximately 40 per cent extrapunitive (blame environment),
30 per cent intrapunitive (blame self), and 30 per cent impunitive
(blame no one). The median percentages were approximately 20,
50, and 30 for obstacle-dominance, ego-dominance, and
need-persistence respectively. Sex differences were not outstanding on these
adult samples. Other scores were also computed for responses to
ego-blocking and superego-blocking situations, and for trend or
tendency to change from one type of response to another.
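
The GCR arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item numbers, the letter codes, the direction labels used as dictionary keys, and the function names are hypothetical and are not taken from Rosenzweig's record blank. The sketch simply counts complete agreements with the expected scores, expresses them as a percentage, and tallies the share of responses in each direction of aggression.

```python
from collections import Counter

# Hypothetical labels for the three directions of aggression.
DIRECTIONS = ("extrapunitive", "intrapunitive", "impunitive")


def group_conformity_rating(subject_scores, expected_scores):
    """Return the percentage of criterion items scored as expected."""
    if not expected_scores:
        raise ValueError("expected_scores must contain at least one item")
    agreements = sum(
        1
        for item, expected in expected_scores.items()
        if subject_scores.get(item) == expected
    )
    return 100.0 * agreements / len(expected_scores)


def direction_percentages(responses):
    """Return the per cent of responses falling in each direction."""
    if not responses:
        raise ValueError("responses must not be empty")
    counts = Counter(responses)
    return {d: 100.0 * counts[d] / len(responses) for d in DIRECTIONS}


# Illustrative data only: twelve criterion items, all expected "E" here.
expected = {item: "E" for item in range(1, 13)}
# The subject matches the criterion on items 1-6 and differs on items 7-12.
subject = {item: ("E" if item <= 6 else "I") for item in range(1, 13)}

gcr = group_conformity_rating(subject, expected)
shares = direction_percentages(
    ["extrapunitive"] * 4 + ["intrapunitive"] * 3 + ["impunitive"] * 3
)
print(gcr)     # six of the twelve items agree with the criteria
print(shares)
```

With half of the twelve items in agreement the function returns a GCR of 50 per cent, which matches the worked case in the text.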

The interpretation of scores is not simple, but in general a para-
noid tendency shows itself in excessive blame of others, and a de-
pressed or guilt complex in blame of self. Constructiveness is shown
by high scores in need for problem solution. Emphasis on the role 
of the barrier is a poor or weak adjustment. High ego-dominance 
scores are typical of extremely selfish or schizophrenic patterns of be- 
havior. There are, of course, many combinations. The uses of the 
test are being investigated. 

Insight into Human Motives Test. Helen Sargent (1944) attacked
the problem of devising a verbal paper-and-pencil test which would 
present somewhat ambiguous stimuli to subjects in such a way that 
they would reveal their own feelings and methods of perceiving and 
organizing material, without being aware of the purpose of the test. 
She chose the title, Test of Insight into Human Motives, in order to
arouse interest and to mislead subjects as to the nature of the test. 
The test consists of items called armatures, because they are bare 
frameworks of conflict situations. Each item describes an individual 
in a conflict situation. The characteristics (except sex) of the imagi- 
nary person are not described, proper names are not used, and 
indications of feeling are avoided in most of the items. Thirty-six 
items were finally adopted, of which 24 were applicable to either 
men or women, 6 to men, and 6 to women. From these, two forms of 
fifteen armatures each were assembled for each sex. Sample items 
are shown in Illus. 174. Recently two forms of ten armatures each
have been recommended which produce sufficient data and simplify
computations. The six main areas of conflict sampled in each form
are: family, opposite sex, social and friendship relations, vocation,
religion or beliefs, and health. 

The subjects were tested singly and in small groups with no time 
limits. About an hour was adequate for one form, and those who had
already taken one form usually took less time on the second. Blank 
sheets of paper were issued with instructions to answer the two ques-
tions which follow each armature, as part of a test to analyze what

ILLUS. 174. SARGENT'S INSIGHT INTO HUMAN MOTIVES TEST

Form I, Men

I. 1. A young man who is working or studying away from home gets a
letter from his mother after the death of his father, asking him
to move back home.
a. What did he do and why?
b. How did he feel?

II. 2. A young man has acquired religious and political opinions
away from home which are in direct conflict with his parents'
ideas. He is home for a visit and religious and political subjects
are discussed.
a. What did he do and why?
b. How did he feel?

III. 3. A young man falls in love. In order to marry he must give up his
studies and make some money immediately.
a. What did he do and why?
b. How did he feel?

IV. 4. A young man gets a good deal of razzing because he spends his
week-ends at home instead of dating.
a. What did he do and why?
b. How did he feel?

V. 5. A young man's father has always looked forward to having his son
take over his business and has educated him for it. The son
becomes interested in another vocation.
a. What did he do and why?
b. How did he feel?

Form I, Women

I. 1. A girl who is working or studying away from home gets a letter
from her mother, after her father's death, asking her to move
back home.
a. What did she do and why?
b. How did she feel?

II. 2. A girl has acquired religious and political opinions away from
home which are in direct conflict with her parents' ideas. She is at
home for a visit, and religious and political subjects are discussed.
a. What did she do and why?
b. How did she feel?

III. 3. A girl gets the impression that others are discussing her. On
several occasions she thinks conversation has stopped or the subject
changed when she enters the room.
a. What did she do and why?
b. How did she feel?

IV. 4. A girl is disapproved by her friends because she spends her
week-ends at home instead of dating.
a. What did she do and why?
b. How did she feel?

V. 5. A girl's parents have always looked forward to having her
follow a particular career and have educated her for it. She
becomes interested in something else.
a. What did she do and why?
b. How did she feel?

(By permission of Dr. Helen Sargent and the editors of Psychological
Review Monographs.)

people do and feel under various circumstances. The subjects were
asked to write for an hour, and to write first on the most interesting
items since they might not have time to finish all of them. It was
pointed out that there are no right and wrong answers, but that
the explanations should show insight into the character. The test
also may be given and answered orally.

In developing a scoring system a large number of variations were
noted, such as number of questions answered, number of lines
written, conflict areas, emotional words, elaborations, irrelevant
statements, conflict solutions, philosophizing, and cliches. From the
answers given by forty-five volunteer students from college psychology
classes and twenty patients at a state hospital, scoring categories
were prepared. Later when additional data were available these were
revised. These categories were much influenced by Murray's analyses
of “need” and “press,” but Sargent felt that fewer, less-overlapping
categories were preferable. She did a great deal of careful work in
defining categories, checking the reliability of scoring, and
determining the stimulus value of each armature.

In scoring and interpreting a record, the scorable phrases are 
identified and classified into categories, then raw and weighted scores 
are computed and changed to standard scores. Lastly standard scores
and ratios are compared with norms. Three main categories (A, E, 
and M) are used. The twelve A categories (Illus. 175) include all
affective reactions, including such feeling verbs or adjectives as “She 

ILLUS. 175. SARGENT'S SCORING CATEGORIES

A. Affective expressions regarding the central character
  1. Frustrating: She felt trapped. His death was a shock.
  2. Challenging: The job struck him as a challenge.
  3. Aggressive: He wanted to get ahead in life.
  4. Passive: She just had to take it.
  5. Evasive: She put it out of her mind.
  6. Depressive: He felt much discouraged.
  7. Pleasant: So glad to see them.
  8. Positive: She wanted to help all she could.
  9. Negative: He didn't like her attitude.
  10. Guilt or inadequacy: She felt self-conscious.
  11. Conflict and confusion: He didn't know what to think.
  12. Rationalization: She couldn't help it.

E. Ego activity
  El. Elaboration: Additions to the armature. In the end he was fired.
      The father had left a lot of money.
  Ev. Evaluation: General evaluative statements. Children owe a duty to
      their parents. He should . . .
  Q. Qualification: At first, but if, probably.

M. Maladjustment indicators
  Ir. Irrelevant feelings, inappropriate to the content.
  Subj. Subjectiveness: Highly personalized. It's not God's method.
  PP. First person pronoun: I, me, my, myself.
  Un. Unreal solution.
  Zero. No working out of the problem.

(By permission of Dr. Helen Sargent and the editors of Psychological
Review Monographs.)

felt miserable.” These are signs of arousal of affect or emotional
sensitivity. The three E categories include elaboration, evaluation,
and qualification, such as “Children owe a duty to their parents.”
These are thought to represent ego activity and to show strength of
ego. Experience has shown that the normal relationship between A
and E is an approximate balance in standard scores on this test.
Hence the ratio A/E is approximately 1.00, or 100 when expressed as
(A/E) 100. Certain evidence shows that among well-adjusted persons
the A may be considerably higher than the E, because they feel little
need for caution or defense. If both A and E are very high, there is
some possibility that ego controls are in danger of breaking. At very
low A levels, found only in schizophrenics, the E activity is relatively
high, although the E may represent only a residual of ego activity.
The A/E rating is, therefore, with certain qualifications an indication
of ego effort to overcome anxiety and frustration. The Insight Test
supplements the Rorschach and TAT by providing a quantitative index
of the amounts of affect aroused, the intellectual defense activity,
and the balance between these.
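
The A/E index can be written out as a small computation. The sketch below assumes that A and E have already been converted to standard scores on a common positive scale (a mean of 50 is used here purely for illustration); the function name and the sample values are hypothetical and do not come from Sargent's norms. No cutoffs are implied, since the interpretation depends on the qualifications noted above.

```python
def ae_index(a_standard, e_standard):
    """Return (A/E) * 100 for two standard scores on the same positive scale."""
    if e_standard <= 0:
        raise ValueError("E standard score must be positive")
    return 100.0 * a_standard / e_standard


# Illustrative values only, not published norms.
balanced = ae_index(50, 50)      # affect and ego activity in balance
affect_heavy = ae_index(60, 50)  # affect exceeds ego activity
print(balanced, affect_heavy)
```

An index near 100 corresponds to the approximate balance described in the text; values well above 100 indicate that affect outweighs ego activity.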

The M category includes evidences of maladjustment such as
irrelevant feeling, introduction of subjective or bizarre material, too
little or too much use of the pronoun I, unreal or illogical solutions
or no solution, and zero categories or zero armatures.

Sargent concluded that there is strong evidence of projection
brought out by the use of this paper-and-pencil test, that the scoring
is reasonably reliable, and that the test yields valid indicators of
balance, as shown by tentative results from various clinical and
normal groups.

Fables Test 

Reuben Fine (1949) reported a study of one hundred children
between the ages of four and fourteen years using his revision of
J. Louise Despert's Fables Test. In this test twenty short unfinished
stories are read to a child who is asked to make up the end of the
story. About 20 minutes is usually needed. Samples of these stories
are (Fine, 1949, p. 106):

1. A daddy bird and a mommy bird and their little birdie are asleep in
a nest in a tree. All of a sudden a big wind blows, it shakes the tree and the
nest falls to the ground. The three birds awaken all of a sudden. The daddy
flies quickly to the pine tree, the mommy to another pine tree; the little
bird knows how to fly. What is the little bird going to do?

The major behavior variables tested by the fables were listed by
Fine as dependency, hostility, identification, sibling rivalry,
reactions to parental rejection, castration fears, and the Oedipus
complex. He gave several protocols which illustrated how the Fables
Test was a useful supplement to the Rorschach. Thus the Rorschach
scores for one nine-year-old boy showed a high degree of compulsion
and suppressed spontaneity and rejection of Card VI. The Fables
Test showed intense hostility toward the father; his solutions
consisted of running away and of growing up quickly. By comparing
the Fables Test scores of thirty asthmatic children with their thirty
closest siblings Fine discovered three marked trends which were
statistically significant: (1) the asthmatic children were a little more
dependent upon the mother than their siblings, (2) the asthmatic
children were also much more hostile to the mother and less hostile
to the father than their siblings, and (3) the trends were a little more
significant for boys than for girls in these small samples.

The results are similar to those derived from other studies of
asthmatic children which used more elaborate approaches. The
advantage of the Fables Test is that from a rapid but systematic
procedure several scores showing hostility and dependence can be
secured in a friendly informal situation.

Blacky Test 

Blum (1949) described a test which seems to have considerable 
reliability in indicating psychosexual development, but which was 
specifically designed to yield evidence regarding several of the 
Freudian concepts. A series of twelve cartoons was prepared featur- 
ing a pup named Blacky, his parents, and a sibling pup named 
Tippy. The cartoons were presented to 119 men and 90 women col- 
lege students separately. When presented to men Blacky is described 
as a son, and when shown to women as a daughter. Each of the 
cartoons is designed to portray a stage of psychosexual development 
or a type of object relationship as described in Illus. 177.

Each cartoon (Illus. 176) was thrown upon a large screen, and 2
minutes were allowed “for you to make up a little story about what
is happening and why it is happening, and so on. Since this is a
sort of test of how good your imagination can be, try to write vividly
about how the characters feel. . . . It is desirable to write as much
as possible within the time limits.” Each cartoon was introduced
orally with some nonleading comment, except Cartoon III, “Here
Blacky is relieving himself (herself),” and Cartoon V, “Here Blacky
is discovering sex.” After the first presentation, an inquiry period
allowed the answering of about seven multiple-choice questions on
each cartoon and direct questions requiring one or two short
sentences (Illus. 178). Upon finishing the inquiry, the subjects were

THE ADVENTURES OF

(By permission of Dr. G. S. Blum and the Editor of Genetic Psychology
Monographs.)

ILLUS. 177. SUBJECT OF BLACKY CARTOONS

A. Four heads of dogs identified as Papa, Mama, Tippy, and Blacky; used for
introduction (Illus. 176).

I. Oral Eroticism. Blacky is suckling the mama dog.

II. Oral Sadism. Blacky is biting a dog collar marked “Mama.”

III. Anal Sadism: Retention or Expulsion. Blacky has defecated near two
large kennels marked “Papa” and “Mama.”

IV. Oedipal Intensity. Papa and Mama are flirting while Blacky looks on.

V. Masturbation Guilt. Blacky is licking own genital region.

VI. Castration Anxiety (Males), Penis Envy (Females). Tippy is about to
have tail cut shorter. Blacky watches.

VII. Positive Identification. A large dog threatens a small wooden toy dog.

VIII. Sibling Rivalry. Papa and Mama show affection for Tippy. Blacky watches
at a little distance.

IX. Guilt Feelings. Blacky cringes; an insert of a small dog with wings points
an accusing finger at him.

X. Positive Ego Ideal (males), Love Object (females). Blacky dreaming sees a
handsome large black male dog.

XI. Positive Ego Ideal (females), Love Object (males). Blacky dreaming sees
a large black female dog.

(By permission of Dr. G. S. Blum and the Editor of Genetic Psychology
Monographs.)

asked to indicate degree of liking for each cartoon, using the letter
L for like and D for dislike. Then the cartoon liked best was to be

ILLUS. 178. BLACKY TEST INQUIRY FORM FOR CARTOON VI (MALES)

1. How does Blacky feel here?
   a. Terrified that he is going to be next.
   b. Puzzled and upset.
   c. Curious but calm.

2. What does Blacky suspect might be the reason for the scene?
   a. Tippy is being punished for having done something wrong.
   b. Tippy is the innocent victim of someone else's ideas.
   c. Tippy is being improved in some way.

3. How does Blacky feel about his own tail?
   a. He's not particularly worried.
   b. He's thinking desperately about a way to save it.
   c. He thinks he might look better with it cut off.
   d. He's so upset he wishes he never saw or heard of tails.

4. Do you suppose Blacky would prefer to have his own tail cut off right away
   rather than go through the suspense of wondering if it will happen to him?
   Why?

5. Which member of the family most likely arranged for Tippy's tail to be cut off?

6. What will other dogs in the neighborhood do when they see Tippy's short tail?
   a. Start worrying about their own tails.
   b. Make fun of Tippy.
   c. Wonder what's going on.
   d. Admire Tippy.

(By permission of Dr. G. S. Blum and the Editor of Genetic Psychology
Monographs.)

chosen and a short statement written of the reasons for liking it best.
Last, the one liked worst was to be selected and the choice explained.
The test was scored by determining for each person the degree of
involvement in each of the categories listed in Illus. 177. The
involvement was determined to be strong (++), fairly strong (+), or
weak or absent (0) on the basis of the spontaneous story, the inquiry,
the cartoon preference, and related comments. Illustrations in Blum
(1949) of strong involvements are:

Cartoon I. Oral Eroticism: Blacky has just discovered the delightful
nectar that Mama can supply; it is an endless supply and she is enjoying it.
She doesn't know where it comes from, but she doesn't care. Mama is pacific
throughout it all, and so forth.

Cartoon III. Anal Expulsiveness: Blacky, still frustrated, shows his
contempt of Mama by leaving a pile of defecation near her house. “There,” he
is probably thinking, “that will take care of her!”

Illustrations of weak involvement are:

Cartoon I. Oral Eroticism: Blacky, a male pup of a few weeks, is having
his midday lunch. Mama is bored with the proceedings, but as a mother with
her maternal instincts is letting Blacky have his lunch to Blacky's
satisfaction.

Cartoon III. Anal Expulsiveness: Blacky was not too slow when it came
to housebreaking. It took him little time to learn that he must relieve
himself outdoors. Outdoors he went when the occasion demanded, unconfined
and relieved.

Blum found that differences between sexes among college students
generally supported psychoanalytical theory. Thus the most prominent
differences between the sexes in the scores were that women
exceeded men in oral sadism, ambivalence, and aggressiveness toward
the parent of the same sex, general guilt feelings, superego with
motherly characteristics, blaming others, and in being narcissistic
in love-object choice, and that men exceeded women in expression
of oedipal intensity, identification toward parent of same sex,
superego with fatherly characteristics, blaming no one, and in being more
hopeful of attaining ego ideal.

The results are also treated by Blum to show intercorrelations
between the categories shown in Illus. 177. Although many of the
correlations are not significantly different from zero, all the significant
correlations tend to support the evidence presented by many
non-Freudians as well as by Freudians that disturbances at one stage
delay or prevent successful completion of a subsequent stage, and may,
if serious enough, cause a regression to or condensation with an
earlier stage. The dominating characteristics of earlier stages nearly
always persist along with the later stage to some extent.

In conclusion, all of the significant test findings show agreement
with psychoanalytic theory where specific evidence is available. In
addition Blum points out that there are many interesting points of
comparison which have not yet been specifically reported in Freudian
writings, and that such systematic comparisons may extend the
knowledge of psychosexual behavior in many important ways. Blum also
notes the need of much more research with other groups and on
validating the technique itself.

Group Projection Sketches 

Recent applications of techniques similar to the TAT to the study
of small groups have been reported by Horwitz and Cartwright (1950)
and by Henry and Guetzkow (1950). The way an interacting group
tells a story about a picture is analyzed to provide evidence of the
structure and dynamics of the group itself. This technique has an
advantage over such methods of group observation as that done by
teams of direct observers, transcribing recorded discussions, and the
use of sociometric analysis. It is considerably less time-consuming and
yields a fairly objective result of group activity which is adaptable to
quantified study. It does not reveal to the group members the
significance of their responses.

Henry and Guetzkow (1950) present five 18 by 21-inch sketches,
one at a time, with the request that the group compose a story about
each: “Tell me what is happening in the picture, what happened
in the past to bring this situation about, and what is going to hap- 
pen in the future.” Usually a group produces a story in about 10 
minutes; it is written by a member of the group or by a stenographer. 
The five pictures are designed to educe different group responses as 
follows: 

Sketch 1. A group of six young men are around a large table which
has some letter-size sheets of paper on it, and one man is standing with his 
back to the group. This picture is designed to elicit thought content and 
feelings about divisions of labor, roles within a group, and types of group 
functions. 

Sketch 2. A man leans in a leisurely manner against the side of a door-
way and looks out at a landscape. This picture often reveals the group's 
attitude to a lone individual, and toward his inactivity, and the group's 
concepts of his inner drives and environmental pressures (Illus. 179). 

ILLUS. 179. GROUP PROJECTION SKETCH NO. 2 



(From Henry and Guetzkow (1950). Used by permission.)

Sketch 3. An older man leans toward a younger man in a close
face-to-face position. This picture often elicits the group's feelings about
authority and its own use of authority.

Sketch 4. A middle-aged woman sits in the foreground with a puzzled
or worried expression; a younger man in the background, holding an
object in his hands, is looking at her. This picture is designed to reveal
concepts of dependence and the breaking of established relationships
and ties.

Sketch 5. Four middle-aged men, two sitting and two standing, are
facing one another in a comfortable clubroom or living room. This picture
usually discloses feelings toward informal groups of persons in authority,
toward new developments, and toward sincerity in a group task.

The stories that were told by two groups are given here for Sketch
2, The Man in the Doorway.

Group I felt this to be a farm scene. The man in the door is pensively
looking at the sunset contemplating his future. He has just returned from
college where he has been graduated and has to decide whether to take
graduate work, stay on the farm, or accept a job in the city. It is this problem
that he is thinking about.

His decision will be to travel during the summer and then take up
graduate work in the fall when school opens again. He is a thoughtful, serious
individual.

Group II thought that this was a young man with time on his hands.
April. Undecided as to whether he should leave or not. Waiting for
something to happen which will help him to make up his mind.

To interpret these stories two procedures are suggested: one a
clinical approach showing content and themes and the other a rating 
approach using eighteen categories. The first procedure is illustrated 
by the following analysis of the first story given above: 

The task-orientation theme already noted in the first picture recurs
(making vocational choice). The story reemphasizes the staff-function of
the group (he takes school work rather than working on farm or in city).
The work of the group consists in making plans which will be executed by
the individual (“contemplating his future”). Again, there is evidence that
the group reaches decisions (definite conclusion to story, as in Picture 1).
Yet, it is not too well motivated (“pensively looking,” “thinking about”;
indirect statements, “It is this problem that . . .”; and the leisure
evidenced in “when school opens again”). It allows itself time for enjoyment
while pursuing its goal (“travel during summer”; similar to social fraternity
setting in Picture 1). The relative youthfulness and inexperience of the
group is again evidenced in the reference to “just returned from college”
and its felt need for further training (“take graduate work”). The lack of
intense conflict and emotionality indicates that the group is free from
intensive internal friction or strife (Henry and Guetzkow, 1950, p. 8).

The group which produced this story consisted of five division 
heads in a personnel-planning department which was still attempt- 
ing to determine its role in a large corporation. The second story was 
made up by members of a group of branch managers in a large civil-
service organization.

The second type of interpretation is made by having trained judges
rate the group, after reading their stories, on a 6-point scale in the 
following categories. The authors recognize the exploratory nature 
of their schema. 

1. Sociodynamics: the interrelations of people within groups and their
concurrent emotions.

a. Communication clarity: shown by unequivocal statements and clear
outcomes.

b. Content-procedure ratio: the relative emphasis on the content or
problem, and on the time spent in discussing what procedure the
group should follow in making up the story.

c. Information providing: the amount of new information which the
group provides.

d. Goal concentration: the degree to which the plot outcome and
hero or other figures are integrated in the story is taken to
indicate the concentration on a unified goal.

e. Problem source: the extent to which the group feels the problem to
be one of its own, or one forced upon it by some outside agent.

f. Value orientation: the kinds of group goals indicated by the stories
are classified under six headings:

1) Achievement: definite plans made for considerable period.

2) Learning and contemplation: considers many implications.

3) Product: quality and amount to be or actually produced.

4) Persuasion: some of the group try to convince others.

5) Advisory: weighs alternatives but recommends action.

6) Fact-finding: interested in getting more facts.

g. Tension level: the amount of tension or energy, from sluggish to very
alert and active.

h. Tension direction: the amount of support given each other or the
expenditure of energy in conflict.

i. Pacing level: the rate at which group operated in discussing and
reaching decisions.

j. Personal interdependence: how much each individual depends
upon others in the group.

k. Personal affect: how much the group members look upon each other
as friends rather than just members of a group.

2. Group Structure:

a. Participation spread: the degree to which all members of the group
take part.

b. Role differentiation: the extent to which the members of the group
perform different functions.

c. In-group feeling: the degree to which the group distinguishes
between those present and those outside, and the possible intrusion
of out-group persons.

d. Individuality of members: the extent to which the individual places
the group's activity above his own personal goals.

3. Process Outcome:

a. Quality of group product:

1) Reality orientation: the degree to which the group’s actual
situation is considered in the story.

2) Organization of outcome: the extent to which the outcome is
well organized and coherently presented.

3) Creativity: stereotyped versus original thinking.

b. Group satisfaction with outcome: satisfied or not with the story. The
solution of conflicts and the quality of endings are taken as evidence.

c. Motivation to execute outcome: ready and willing versus unwilling
to carry out a solution or to continue to find a solution.

The categories listed above are described in more detail by Henry 
and Guetzkow (1950) and evidence from stories is given to illustrate
each. The authors also point out that much valuable supporting 
evidence can be secured from observations of the group and knowl- 
edge of its composition and place in an organization. Ratings in 
these categories lead to quantitative descriptions of groups and form 
the groundwork for objective analysis and a basis for the preparation 
of norms. An interesting comparison can be made between these 
categories for group structure or functioning and the categories used 
in the TAT or Rorschach techniques. Both group and individual 
analyses deal with the strength of somewhat independent persons, 
their different needs or goals, their tendency to work together or to 
have conflicts, and the resulting patterns of behavior.
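
The step from judges' ratings to a quantitative description of a group
can be pictured with a short computational sketch. It is illustrative
only: it assumes that each judge rates the group from 1 to 6 in each
category and that the ratings are simply averaged, a rule which the
source does not specify. The function name and the sample ratings are
hypothetical.

```python
from statistics import mean

def group_profile(ratings_by_judge):
    """Average each category's 1-6 ratings over all judges (assumed rule)."""
    categories = ratings_by_judge[0].keys()
    profile = {}
    for category in categories:
        values = [judge[category] for judge in ratings_by_judge]
        if any(not 1 <= v <= 6 for v in values):
            raise ValueError(f"rating outside the 6-point scale: {category}")
        profile[category] = mean(values)
    return profile

# Hypothetical ratings by three judges on three of the eighteen categories.
judges = [
    {"communication clarity": 5, "tension level": 2, "participation spread": 4},
    {"communication clarity": 4, "tension level": 3, "participation spread": 4},
    {"communication clarity": 6, "tension level": 1, "participation spread": 4},
]
profile = group_profile(judges)
```

A profile of this kind is only as good as the agreement among the
judges, so the spread of the separate ratings deserves inspection
before the averages are used.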

The Szondi Test

Lipot Szondi, a Hungarian psychologist, began in 1930 to develop
a procedure for evaluating as test stimuli the photographs of persons
known to be extremely abnormal. The procedure, as described by
Susan K. Deri (1949), requires that forty-eight photographs be
presented to a subject, eight at a time. From each set of eight the subject
is asked to choose the two he likes the most and the two least liked.
After all forty-eight pictures have been viewed, the twelve which the
subject liked the most (he may not have liked any of them) are again
presented, and he is asked to choose the four that he likes the most
among the twelve. Similarly the twelve least liked are again presented
to allow a selection of the four most disliked. The test, which takes
only 5 minutes, is administered to the subject eight or ten times on
different days.

Each set of eight photographs contains one of each of the following:
a homosexual, a sadistic murderer, an epileptic during a quiet
period, an hysteric, a catatonic, a paranoiac, a depressive, and a
manic person. Of course the lay subject does not know this, and even
well-trained clinicians have seldom been able to identify correctly
the clinical syndromes presented in more than three or four of the
pictures.

The scoring simply records the number of photographs liked or
disliked in each syndrome or category. The interpretation is based
on Szondi’s belief that each person has eight areas of adjustment
which correspond to the eight categories, and that a fairly good
balance among all eight is essential to mental health. Szondi’s experience
with the test led him to think that an unusual number of likes for
photographs in one category is usually associated with repression and
tension in that area, because of inability to find normal outlets there.
When none or only one picture is chosen from a category, however,
Szondi interprets this to mean that the subject has little or no
tension because he has developed socially acceptable outlets in this
area. Szondi found that disliked photographs were often chosen
from the same category as the liked photographs. He found evidence
that dislikes indicated the same sorts of repressions as likes, and that
a large number of dislikes implies that the tension is near
manifestation. Deri, a psychoanalyst, presents several fairly complete cases and
an interesting discussion of the possible contributions of this test to
diagnosis. She believes that the test may contribute uniquely to a
revelation of a subject’s personality, because it seems to tap
unconscious levels of behavior, and requires neither language nor
movement.
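
Since the scoring is a simple count, it can be sketched in a few lines.
The following is an illustrative sketch only, not the scoring form used
by Szondi or Deri; the category labels, the function name, and the
sample choices are assumptions made for the example.

```python
from collections import Counter

# Labels for the eight categories named in the text (wording is illustrative).
CATEGORIES = (
    "homosexual", "sadistic", "epileptic", "hysteric",
    "catatonic", "paranoiac", "depressive", "manic",
)

def tally_choices(liked, disliked):
    """Return {category: (number liked, number disliked)} for one administration."""
    unknown = (set(liked) | set(disliked)) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    likes, dislikes = Counter(liked), Counter(disliked)
    return {c: (likes[c], dislikes[c]) for c in CATEGORIES}

# Hypothetical record: two liked and two disliked photographs from each of six sets.
liked = ["manic"] * 4 + ["hysteric"] * 4 + ["catatonic"] * 3 + ["epileptic"]
disliked = ["sadistic"] * 6 + ["paranoiac"] * 3 + ["manic"] * 2 + ["depressive"]
scores = tally_choices(liked, disliked)
```

In this invented record the manic category draws both likes and
dislikes, and the homosexual category draws no choices at all; the
interpretation of such a pattern rests on the clinical reasoning
described above, not on the count itself.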

Single-Word Associations 

Following the technique of Jung, Kent and Rosanoff (1910)
secured the first response to one hundred common words from one
thousand normal persons (Illus. 17). The subjects were simply
instructed to give, after a stimulus word, the first word which came to
mind. The frequency of each response to each word was secured, and
an individual’s score was calculated to show the proportion of
unusual responses. They found that among average normal persons 7
per cent of the total responses were unusual, and that 247 mental
patients gave an average of 27 per cent unusual responses. The list of
Kent and Rosanoff has been widely used among groups of college
students, feeble-minded persons, Negroes, and school children, so
that per cents of unusual responses are available for these groups.
However, the variety of explanations of unusual responses makes the
interpretation of these results difficult. Rosanoff (1920) and
O'Connor (1934), working with what they thought to be fairly average adult
samples, believe that unusual responses are largely due to unusual
modes of behavior, such as autistic thinking and a tendency toward
extroversion or introversion. McClatchy (1928) and several others
advanced the theory that persons with a large number of unusual
responses are generally those with a large vocabulary or high
intelligence. This is certainly true of college and high school groups. Wheat
(1931) found that those with low intelligence also gave a larger
proportion of unusual responses than the average. This was because of
lack of understanding of the stimulus word or lack of ability to vary
responses.
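
The per cent of unusual responses can be illustrated by a brief sketch.
It assumes that a response counts as unusual when it was given by fewer
than a chosen number of persons in the norm group (by default, when it
does not appear in the frequency table at all); the source does not
state the exact criterion. The frequency table shown is invented for
the example and does not reproduce the Kent-Rosanoff figures.

```python
def percent_unusual(responses, norms, min_count=1):
    """Per cent of responses given by fewer than min_count persons in the norms."""
    if not responses:
        raise ValueError("no responses to score")
    unusual = 0
    for stimulus, response in responses.items():
        count = norms.get(stimulus, {}).get(response, 0)
        if count < min_count:
            unusual += 1
    return 100.0 * unusual / len(responses)

# Hypothetical norms: stimulus word -> {response: number of norm subjects giving it}.
norms = {
    "table": {"chair": 300, "wood": 80},
    "dark": {"light": 400, "night": 200},
    "man": {"woman": 350, "boy": 50},
    "soft": {"hard": 300, "pillow": 60},
}

# Hypothetical subject: one response ("moon") is absent from the norms.
subject = {"table": "chair", "dark": "night", "man": "boy", "soft": "moon"}
score = percent_unusual(subject, norms)
```

Raising `min_count` makes the criterion stricter, which is one reason
why per cents reported by different investigators are not always
comparable.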

Another standardization of results of a free-association test has
been described by Wyman (1925). On the basis of teachers' ratings of
interest in “intellectual, social, or activities programs,” three groups
of children were selected. The word-association responses of each
group were tabulated and compared. Three keys were then devised to
allow three scores, one for similarity to those with intellectual
interest, another to those with social interests, and a third to those with
activity interests. A more elaborate study of a similar sort was
reported by Kelley and Krey (1934). In this case both teachers' and
pupils' ratings were used to select groups of children who would be
characterized by “courtesy, fair play, honesty, loyalty to fellows,
mastery, poise, regard for property rights, and school drive.”
Word-association tests were applied, and keys constructed to give scores in
these eight divisions.

On subsequent applications, both Wyman's and Kelley's tests 
proved to have little discrimination. Other groups of children, when 
rated and tested, showed low correlations between ratings and test 
scores. Kelley applied Hotelling's technique to the analysis of tests 
and ratings and discovered two main components which seemed to 
him to correspond to general social conformity and to assertiveness. 

These tests of free association have in general been disappointing,
but the basic assumptions and techniques are probably worthwhile.
The failure to be discriminating is probably due to the shortness or
fragmentary nature of the answers. One is often left in doubt as to
the nature of response when only the first word is considered. If a
large number of responses were allowed for each stimulus word, as
is the case in appraising the Rorschach ink blots, the classification of
responses would be much more accurate.

Meltzer (1935) investigated children's associations with parents by
encouraging the children to think aloud after each stimulus word.
In the list of stimuli were the words father and mother. Unlimited
responses to these two words were classified for pleasantness, degree
of attachment, level of socialization, and other parent-child
relationships. He reported 82 per cent agreement on the classification of
responses by several investigators.

More recently Rapaport, Gill, and Schafer (1946) have pointed
out that the first quick response to a stimulus word is a reflection of
the relative strength of the id and the ego. Hence they are often more
interested in the effective behavior of the individual than in the
reaction word. After the test is given a standard inquiry period is used to
seek reasons for the behavior noted. They have reported significant
differences among 151 patients in number of:

a. Close reactions, which indicate difficulty in the associative
process; i.e., stimulus word repeated, self-reference, attributes, or naming
of present objects.

b. Distant reactions, which indicate difficulty in the synthetic
process, such as little or no connection between stimulus and reaction
word.

c. Content analyses, unusual disturbances around words which
have particular connotations, such as sex, aggression, family, and
food.

d. Disturbances in answering, such as long reaction time, failure to
react, voice, and gestures.

Sentence Completion 

A type of test for appraising insight and adjustment in which one 
writes a few words to complete an unfinished sentence was described 
by Payne (1928), but it has not been developed thoroughly or used 
widely to date. However, it has the following advantages:

1. Is suitable for administration to groups or to individuals of 
twelve years of age or more, 

2. Samples a wide variety of provocative situations, 

3. Has no time limits, 

4. Provides material that allows fairly objective scoring methods, 

5. Supplies more information than a single-word association test,

6. Keeps the subject unaware of the method of scoring or the
purposes of the test.

To make use of all these advantages, however, each item must be 
carefully constructed and evaluated in practice. In the best forms, 
much effort is given to avoid items which tend to yield single-word 
responses or stereotyped completions. The item is, therefore, usually 
vague or unstructured, and the subject of the sentence changes from 
first person to other persons. Illustration 180 shows twenty items from 
Rohde’s (1946) test, and the answers given by a fifteen-year-old boy 
with an IQ of 114. 

ILLUS. 180. ROHDE-HILDRETH SENTENCE COMPLETION TEST

(Sample of 20 items and answers. The complete test has 64 items.
From Rohde, 1946, p. 173.)

Subject A. Age: 15 yrs., 7 mos. IQ: 114.

1. I want to know if all have the same feeling for art as I do.

2. The future seems very bright and cheerful.

3. My school work has been very interesting to me.

4. Earning my living is a thrill.

5. My greatest longing is to paint.

6. Secretly I steal food from the pantry.

7. If I fail in algebra, I would practically “die.”

8. There are times when I feel like running away and start a new life.

9. Work is pleasant and hard.

10. Friends are a help and encouragement.

11. I become embarrassed when I can't dance.

12. Girls fascinate me.

13. Love is grand!

14. Other people are quite interesting.

15. The laws we have are sometimes unjust.

16. I cannot understand what makes me so nervous and stammer.

17. My stomach is fine and holds a lot.

18. At night I study.

19. My mother is dead!

20. Death is sometimes inviting.

(By permission of Amanda R. Rohde and the editor of the
Journal of Applied Psychology.)

The content of most sentence-completion tests tends to embrace
many areas of activity in an attempt to discover the subject's
principal drives or needs, his self-ideals, his degree of success or failure,
and his attitudes toward self, others, and the world. One of the most
thorough approaches is the 100-item sentence-completion test of
the Office of Strategic Services (1948) which was designed to shed
light on twelve areas of adjustment, which were described somewhat
as follows:

1. Family; relation with parents and siblings

2. The past; childhood and early events

3. Drives; major motivating forces

4. Attitudes toward self

5. Goals; conscious objectives, self ideals

6. Cathexes; likes and interests, objects and ideas

7. Energy, productivity

8. Reaction to frustration

9. Time perspective

10. Optimism, expectation

11. Reaction to inferiors, equals, superiors

12. What he thinks others think of him

Shor (1946) reported a 50-item Self-Idea Test for military use, in
which 15 per cent of the items were concerned with military life, 35
per cent with attitudes relative to situations which could be either
military or civilian, and 50 per cent with general life situations. Shor
developed a sequence of items which had “shock absorbers” planted
between the more emotionally charged items, and a sequence
intended to develop greater emotional involvement. Much more study
is needed on the effects of the sequence of items.

The scoring of sentence-completion tests always involves
appraising uniqueness of responses, their logical sufficiency, personal
reference, and repetitions of responses. Rohde (1946) and Sanford et al.
(1943) analyzed the number of needs, press, and inner states, and
rated them for intensity on a 3-point scale. Rotter and
Willerman (1947), using a 40-item test in the Army, analyzed the
number and rated the degree of the following:

1. Conflict or unhealthy responses were given plus values from 1
to 3. Thus: “I can't think straight” was given +3.

2. Positive or healthy responses were given minus values from 1 to
3. Thus: “Other people are swell” was scored -3.

3. Neutral responses were given no score.

These authors published an illustrative set of scoring standards for
each item which makes possible fairly objective scoring. The
standards are based on answers from 15 patients with serious psychiatric
disorders, 15 with combat disturbances, and 15 with no serious psy- 
chological problems. Rohde (1946) used 670 ninth grade students of 
about fifteen years of age. On all of these scales a great deal more 
work is needed to establish reliable norms. 
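
The plus-and-minus scheme of Rotter and Willerman lends itself to a
simple illustration. The sketch below assumes that the signed values
are summed to give a single score; the source states how single
responses are valued but not how they are combined, so the summation,
the function name, and the sample ratings are all assumptions.

```python
def total_score(rated_responses):
    """Sum signed ratings: conflict adds 1-3, positive subtracts 1-3, neutral adds 0."""
    total = 0
    for kind, degree in rated_responses:
        if kind == "neutral":
            continue
        if degree not in (1, 2, 3):
            raise ValueError(f"degree must be 1, 2, or 3, got {degree!r}")
        if kind == "conflict":
            total += degree
        elif kind == "positive":
            total -= degree
        else:
            raise ValueError(f"unknown kind of response: {kind!r}")
    return total

# Hypothetical ratings of four completions by one scorer.
ratings = [
    ("conflict", 3),   # e.g., "I can't think straight"
    ("positive", 3),   # e.g., "Other people are swell"
    ("neutral", 0),
    ("conflict", 1),
]
score = total_score(ratings)
```

Under this assumed rule a high positive total would point toward
conflict and a negative total toward healthy adjustment.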

The reliability of sentence-completion tests is reported to be fairly 
high. Rohde (1946) found after 8 months a retest consistency of .82 
with girls, and .76 with boys. Rotter and Willerman (1947) found a 
split-half correlation of .85 when corrected by the Brown-Spearman 
formula, and an average inter-scorer reliability of from .81 to .91 
among seven scorers on a population of fifty convalescents. 
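
The correction named above, more often called the Spearman-Brown
formula, estimates the reliability of a whole test from the correlation
between its two halves. A minimal sketch follows. The half-test
correlation of .74 is a hypothetical figure chosen for illustration;
the source reports only the corrected value.

```python
def spearman_brown(r_half, factor=2.0):
    """Estimated reliability of a test `factor` times as long as the part correlated."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)

# Hypothetical uncorrected correlation between the two halves of a test.
r_half = 0.74
r_full = spearman_brown(r_half)
```

With `factor` left at 2 the function gives the usual split-half
correction; other values estimate the effect of lengthening or
shortening a test by that ratio.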

Rotter and Willerman (1947) found a correlation of .61 between 
test scores and a judgment of “severity of disturbance” based on case 
studies. They also reported correlations of from .39 to .41 between
test scores and psychiatric diagnoses. These findings indicate that this 
type of test will yield a fairly stable and meaningful score. Much work 
is needed to determine its best form for various groups. 

Proverbs 

Rabin and Broida (1948) selected forty-one proverbs from a larger
number which were thought by five psychologists to have possibilities
in tapping deep psychological experiences of the subject. Eight of
these proverbs were:

1. It takes two to make a quarrel.

2. An idle brain is the devil’s workshop.

3. A round peg in a square hole.

4. A man alone is either a saint or devil.

5. Marriages are made in heaven.

6. All truths are not to be told.

14. A bad woman is worse than a bad man.

38. A contented mind is a continual feast.

Two groups, one of hospital patients and the other of nurses, were
asked to indicate the ten “best” proverbs. The patients show
significantly more preference for the last three items above. Numbers 6
and 38 seem to be related to paranoid trends and anxiety, but
number 14 seems to be principally a sex difference. Comparison of
individual patients’ choices with their situations often indicated that
certain proverbs had direct or indirect reference to failures, anxieties,
and preoccupations. An inquiry period following individual tests
yielded evidence of the types of behavior similar to those yielded in
the inquiry period of the Rorschach, for example, constriction,
confabulation, perseveration, and self-reference.

STUDY GUIDE QUESTIONS 

1. How can appreciation of literary style be measured most accurately?

2. How can the pleasantness of speech sounds be evaluated?

3. What is the rationale for the use of proverbs in personality examining?

4. On what basis were the stimuli used in the TAT test chosen?

5. What are the advantages of using pictures, rather than inkblots or
proverbs, as stimuli?

6. To what extent do the directions for administration of the TAT
disclose its purpose and stimulate the subject to cooperate?

7. On what basis has Murray classified needs and press?

8. What sorts of scores and summaries are used in interpreting TAT
results?

9. What syndromes of needs did Sanford discover among children?

10. In what ways did alcoholics show personality traits different from
normals?

11. What differences did Eron find between TAT themes of
schizophrenics and of normals?

12. What are basic scoring categories for the Picture-Frustration Test,
and how are they measured?

13. What scoring categories did Sargeant describe?

14. What are the basic assumptions of Blum in preparing the Blacky
Test?

15. What scoring categories are proposed for group characteristics as
shown by group stories about pictures?

16. What are the advantages and disadvantages of single-word stimulus
and response tests?

17. What categories can be used in scoring sentence completions for
personality characteristics?

18. Make a chart comparing the scoring categories of the various types
of tests.



CHAPTER XIX 


PLAY AND DRAMA 




INTRODUCTION 

Three main divisions are used below in describing projective
measures employing play or drama as techniques: (a) play in normal
situations; (b) role playing; and (c) miniature stage settings. All three
divisions seem to yield similar types of evidence regarding the use of
space and materials, thought content, and emotional involvements.
The subjects show by language and other expressive movements their
defenses, abilities, and feelings. Several investigators have pointed
out that both children and adults often make toys do what they
themselves would actually like to do but dare not.

The principal problems of measurement in this field are to define
accurately the phenomena being observed, to establish degree or
intensity of involvement, to determine what part of the activity is
symbolic of deeper individual tensions and what part is merely a
reflection of cultural stereotypes, and lastly, to discover the
variability that is typical of individuals and of groups.

PLAY SITUATIONS 

The use of play situations for therapeutic purposes is probably as
old as the human race. All ages and both sexes use play to relieve
tensions and develop confidence. Play has been defined in several
ways, depending upon the type of activity involved. The physical
activity employed may be marked, as in playing tennis, or merely
incidental, as in talking or thinking games. The social aspects of
play vary from solitaire to situations requiring fine team
coordination. The ideational content varies widely, from strict adherence
to definite rules, as in card games, to the wildest fancies, found in
various types of unstructured play. Kanner (1940) has given a
number of reasons or explanations for play, namely, (a) recapitulation of
racial experiences or instinctive patterns, (b) expenditure of energy,
(c) relaxation, (d) self-expression, (e) communication, and (f)
experimentation. The psychoanalysts have studied the dynamics of play in
children intensively and find much evidence of catharsis, that is, an
outlet for a libidinal urge accompanied by emotional satisfaction.
Among clinicians play has frequently been used for diagnosis and
concurrent treatment, but little is as yet available which may be
called standardized observation and measurement of play.

One advantage not found in most measurement procedures is
that play usually provides a natural situation in which a child or 
adult reveals his feelings, wishes, and fears rather freely. The great 
diversity of responses, however, has emphasized the uniqueness of a 
performance and has made its observation and standardization ex- 
tremely complicated. For accurate interpretation there must be norms 
and well-defined experimental controls. Unless one knows with con- 
siderable accuracy the degree to which a person varies from his group 
norms, any reliable appraisal or diagnosis is impossible. 

Among the most careful technical studies of play are those re- 
cently published by the Iowa Child Welfare Research Station, pre- 
pared under the leadership of Robert R. Sears. Bach (1945) made 
quantitative studies of fantasies in young children, measuring the 
amount and type of thematic play, the effect of environmental vari- 
ables, and the degree to which play themes reflected actual experi- 
ences. Phillips (1945) reported the effects of different amounts of 
realism in play objects, and of different lengths of play periods on 
aggression and “tangentability” (i.e., turning to other activities away
from the experimental situation). Pintler (1945) studied the effects 
of different amounts of stimulation by the observer, and Robinson 
(1946) compared the effects of using dolls representing the child's 
own family with the effects of using dolls of a standard set. These 
studies were summarized in part by Pintler, Phillips, and Sears (1946) 
in a study of sex differences. The type of behavior most frequently 
found among children is summarized in Illus. 181. The girls showed
more stereotyped themes; the boys had more nonhuman themes, more
theme changes, and more nontangential aggression; and there were
no reliable sex differences in exploratory or organizational activity
and nonstereotyped thematic play. The authors believe that social
learning in early childhood caused most of these differences.
Although no standardized scales or individual profiles have come from
these studies as yet, they lay the foundation for categories of behavior
which can be reliably observed and recorded. The original reports
should be consulted for specific definitions of the terms listed in Illus.
181 and also for much more detailed information.

ILLUS. 181. BEHAVIOR CATEGORIES IN DOLL PLAY *

Category                          Girls    Boys
Exploratory                        26.3    23.4
Organizational                     35.1    58.4
Inappropriate Organizational        2.7     7.1 **
Stereotyped Thematic               71.1    43.4 **
Self Thematic                       4.2     4.1
Nonhuman Thematic                   1.6     7.5 **
Nonstereotyped Thematic            32.3    34.0
Tangential Thematic                32.6    45.6
Tangential Play                    16.8    15.5
Nontangential Aggression           18.9    30.4 **
Tangential Aggression               5.3     8.7
Total Aggression                   24.2    39.1 **
Number of Theme Changes             4.5     8.9 **

* Adapted from Pintler, Phillips, and Sears (1946), Journal of Psychology, 21,
p. 77.

** Differences significant at or below the 5 per cent level.

(By permission of the authors and the editors of the Journal of Psychology.)

A great deal of thoughtful use has been made of play therapy by 
clinicians in dealing with anxiety. Although therapy falls outside 
of the scope of this book, the therapists have contributed a number 
of basic concepts which are important in test development. For in- 
stance Erikson (1941) noted variations in psychosexual status and 
maturity in the play sphere or setting, such as play with small objects 
(microcosmic), play with life-sized objects (macrocosmic), and play 
with own body, fingers, voice, and sensations (autocosmic). He also 
described location determinants, such as up and down, backward or 
forward, left or right, open or closed. He placed special emphasis on 
play disruption, that is, on onset of inability to play, which may be 
sudden or slow. Lastly, he described psychoanalytical symbols which 
are usually condensed and abstracted in form and sublimated. Among 
older children and adults, studies of spontaneous play are rare, but 
such studies using careful observation and techniques might be very 
revealing. 


ROLE PLAYING 

Role playing has been given considerable impetus in this country
by J. L. Moreno (1946), who used it principally for training in
spontaneity and, when necessary, for therapy. Incidentally it is an
important method of personality diagnosis.

Moreno and his students have developed what they call
psychodrama, a procedure for releasing emotions which uses a specially
constructed stage and a director who initiates, stimulates, and guides the
action of both players and audience. They point out that a
psychodrama test is superior to other projective methods because it is an
actual sample of behavior in a social setting with “real obstacles.” It
reveals cultural levels as well as personality. Their psychodrama test
consists of placing a person on the stage and observing and recording
his performances under as many as nine test situations. The first
requires the subject to imagine a person and to give time, place, identity,
and characteristics. The director may introduce a wide range of
themes, such as love, death, economic problems, and self-realization.
The other situations are designed to give further information
concerning the subject's goals, choice of methods, perception of themes
already in progress, and rapid adjustment to changing situations. As
yet the observing techniques and interpretative schemes are rather
vague, but they are rapidly being improved.

In industrial personnel work role playing has recently taken on 
significance, both in training of supervisors and in appraisal of candi- 
dates for employment (Chapter XXIV). 

MINIATURE STAGE SETTINGS 

During the last ten years considerable attention has been given to
the development of procedures in which a miniature stage or table 
top is used for analysis and therapy. All of these procedures record 
the progress and final results of manipulating objects and also en- 
courage oral explanations. 

The World Test 

Lowenfeld (1939) experimented for ten years at the Institute for
Child Psychology in London with a World Test, which furnished
a wide variety of small objects and dolls to be constructed into a
“world” on a flat space. Buhler and Kelly (1941) further developed
the test and published materials and a manual for careful
observations. They placed on a table top 150 pieces representing houses,
people, animals, trees, fences, cars, and other common objects and
asked the subject to build whatever he would like. Bolgar and Fischer
(1947), using 232 pieces, have developed detailed scoring schemes
and norms for 100 adults: 50 men and 50 women. They divide the
scoring into six categories: (1) order of choice of first and subsequent
pieces, (2) amount and variety of pieces and spaces, (3) Gestalt or
configurations based on such needs as practical, logical, social, vital,
or esthetic ones, (4) content (ideas and items used and rejected), (5)
behavior or organizational activity, and (6) verbalization.

Michael and Buhler (1945), working with normal and abnormal 
adults, found indications of basic personality structure in the world 
structures made by individuals. Thus psychotics in general indicated 
the world as aggressive or threatening. Sex-problem cases and com- 
pulsive children constructed unpopulated worlds, indicating fear or 
hostility to people. Mentally defective adults and alcoholics used
less than one third of the items, indicating lack of imagination and 
interest. Psychopathic and compulsive personalities revealed anxie- 
ties by building fences or protective walls and closed spaces. Hyster- 
ical persons built disorganized and confused worlds, and obsessive- 
compulsive persons made rows or rigid patterns indicating deep- 
seated inhibitions. 

Erik Homberg (1938), a psychoanalyst, cooperated with Murray 
in the study of character formation of a group of college men. Each 
subject was brought into a room where a table was covered with 
small toys. These included toys readily identified as a father, mother, 
son, daughter, little girl, maid, policemen, farmers, animals, furni- 
ture, autos, blocks, and walls. The observer stated that he was inter- 
ested in ideas for moving pictures and wished the subject to use the 
toys to construct a dramatic scene. The observer left the room 15 
minutes while the subject believed he was unobserved. Actually his 
actions were watched and recorded through a one-way screen. Then 
the observer reentered the room and wrote down the subject's
explanations and sketched the scene. Of 22 subjects, 5 failed to produce
a dramatic scene, 13 produced auto accidents or arrangements which
prevented an accident, 9 made the little girl the object of danger. In
other parts of these scenes 7 females were kidnaped, or bitten, or
fainted, or died. The little girl was also handled or “run over” by a
car in many of the preliminary situations. No male figures were ever
in danger. The dog was the victim of an accident in the scene
constructed by a masculine and socially adapted person who said it was
the little girl's dog and later, that women are faithful, they are dogs.
The red racer had accidents in scenes by two persons who were
nearest to manifest homosexuality and manifest psychosis. For a similar
group of five college women the most frequent scene showed a 
criminal man who deserts, neglects, or murders his family, or strangles 
his wife, or steals, or tries to do these things but is prevented. 

Homberg reported that much of the behavior was indicative of 
personality structure and that when asked for a dramatic production, 
most of the subjects produced symbolic traumatic tensions of their 
own. In offering little toys for a dramatic task, he provoked a return 
of infantile conflict, and the subjects seemed to have continued from
where they left off in childhood. The main recurring themes may be 
partly reflections of newspaper and movie drama, but also seem to 
reflect convincingly the personal sexual needs among unmarried 
young adults. The method, as Homberg uses it, certainly brings out 
much material which, like the TAT or Rorschach results, may be 
analyzed and quantified. 

Make-a-Picture Story (MAPS) Test 

E. S. Shneidman (1948) described the Make-A-Picture Story Test 
in which the subject is allowed to place sixty-seven small cardboard 
figures on a table, and then make a picture and tell a story about it, 
using a background card. The twenty-two backgrounds (8½- by
11-inch achromatic pictures) include:


living room         doorway
street scene        cellar
medical scene       landscape
bathroom            cave
dream               raft
bridge              attic
bedroom             shanty
blank               cemetery
forest              nursery
closet              schoolroom
camp                stage


Some of these backgrounds are unstructured or ambiguous, as the 
blank and doorway; some are semi-structured, as the forest and cave; 
and 15 pictures are definitely structured. A wide enough variety of 
backgrounds is included to touch nearly all the problem areas found 
in clinical cases. The 67 figures, listed in Illus. 182, are 19 male white
adults, 11 female white adults, 12 children, 10 minority-group figures,
6 legendary figures, 5 silhouette figures with blank faces. Most of
these are standing and clothed, but some are partly clothed or nude.
The tallest human figure is 5½ inches. The numbers in Illus. 182
are a code for rapid recording of results. 

The examiner presents one background at a time, asking the 
subject to “select one or more of the figures, put them on the
background and tell a story about who the characters are, what they are
doing and thinking and how they feel, and how the whole thing 
turns out.” The number of backgrounds used depends somewhat 
upon the subject. If possible the first ten in the list are presented, 
and then the subject is allowed to choose from among the rest. 
ILLUS. 182. MAKE-A-PICTURE STORY (MAPS) TEST *

Code    Subject

MALE ADULT

M-1     Nude male; rear view
M-2     Man undressing
M-3     Soldier standing at attention
M-4     Military figure; right hand pointing down
M-5     Policeman
M-6     Supine figure with blood spots
M-8     Priestlike, in long robe
M-9     Man with brief case; coat over arm
M-10    Man carrying baseball bat and box
M-11    Man with fist raised
M-12    Man with both hands on left cheek
M-13    Man with both hands folded in front of him, looking down
M-14    Man with polka dot necktie, eyebrows raised
M-15    Man with right hand in pants pocket
M-16    Older man with mustache, dressing gown, left fist raised
M-17    Rear view of man on haunches looking at picture
M-18    Cripple, man on crutches
M-19    Figure with back of right hand on hip; left arm extended;
        possibly effeminate

FEMALE ADULT

F-1     Nude female
F-2     Female undressing
F-3     Woman both hands on left thigh
F-4     Rear view; dress torn at left
F-5     Both hands to mouth
F-6     Bending over; arms up, apron
F-7     Eyes wide open; eyebrows raised
F-8     Left hand up; right hand holding booklike object
F-9     Woman, right hand to right ear
F-10    Old lady with shawl
F-11    Young woman in defensive position; left elbow in air

INDETERMINATE AS TO SEX

I-1     Supine figure in slacks or pants, left hand on belt
I-2     Rear view of seated figure, head resting on left arm

CHILDREN

C-1     Sad girl, hands behind back
C-2     Girl; hands folded on dress
C-3     Girl with ribbon in hair
C-4     Girl; rear view running
C-5     Nude girl
C-6     Nude boy
C-7     Boy; rear view walking
C-8     Boy with left hand to eye
C-9     Boy with left fist raised
C-10    Boy; both arms outstretched; bandage on left leg
C-11    Boy; hands on chest; looking up
C-12    Little boy; right hand extended

LEGENDARY AND FICTITIOUS

L-1     King in 16th century costume
L-2     Pirate
L-3     Santa Claus
L-4     Ghost
L-5     “Futureman” with cape and tights
L-6     Witch, ugly old woman with tongue out

ANIMAL

A-1     Cocker spaniel pup
A-2     Snake

MINORITY GROUPS

N-1     Old Negro man; patched clothes
N-2     Mammy-type Negress; headkerchief
N-3     Negro man reading paper
N-4     Negress in business suit
N-5     Negro zoot-suiter; with knife
N-6     Negress in white dress and shoes
N-7     Pious Jew; beard and skull cap
N-8     Merchant Jew; wearing vest
N-9     Latin-American female, bracelets on left arm
N-10    Oriental female; kimono

SILHOUETTE AND BLANK FACES

S-1     Solid black male silhouette
S-2     Man with blank face
S-3     Woman with blank face
S-4     Boy with blank face
S-5     Girl with blank face

* List of figures from Shneidman (1948), p. 169.

(By permission of the author and the editor of Genetic Psychology Monographs.)

The picture results are recorded on location charts and the stories
are taken down as nearly verbatim as possible. The quantitative
results include:

1. The number of figures used on each background and the
average number of figures for all backgrounds used.

2. The number of times the same figure is used and the number
of times a type of figure, such as legendary, is used.

3. The placement of the figure, such as walking, floating, prone,
outside background, and on top of another figure.

4. The number of times a specific figure and a type of figure are
used with a specific background.

5. Number and type of figure interaction.

6. Number of times figures are described in a particular activity,
such as sightseeing, cleaning up, eating, and murdering.

7. Number of times figure represents a specific person: self,
others, no one, etc.

8. Time at which a figure is brought into the action: before the
story began, as story unfolds, after story is completed, etc.

9. Use of backgrounds: ignored, rejected, two used for one story,
used for mood, etc.

10. Time in seconds to placement of first figure, to the beginning 
of story, and to the end. 
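
As an illustration only, the first two tallies in the list above can be sketched in a few lines of Python. The record layout and the sample figure codes are assumptions made for this example and are not part of Shneidman's recording forms; the letter prefix of each code is taken to mark the type of figure.

```python
from collections import Counter

# Hypothetical record: one entry per background presented to a subject.
# Figure codes are assumed to carry a type prefix (M, F, C, L, S, ...).
record = [
    {"background": "living room", "figures": ["M-9", "F-10", "C-3"]},
    {"background": "street scene", "figures": ["M-5", "M-11"]},
    {"background": "dream", "figures": ["L-4", "C-3", "S-1", "M-9"]},
]


def tally(record):
    """Count figures per background, their average, and figure/type usage."""
    per_background = {r["background"]: len(r["figures"]) for r in record}
    average = sum(per_background.values()) / len(per_background)
    all_codes = [code for r in record for code in r["figures"]]
    figure_use = Counter(all_codes)                           # same figure reused
    type_use = Counter(code.split("-")[0] for code in all_codes)  # type of figure
    return per_background, average, figure_use, type_use


per_background, average, figure_use, type_use = tally(record)
print(per_background)  # number of figures on each background
print(average)         # average number of figures per background
print(figure_use)      # how often each figure was used
print(type_use)        # how often each type of figure was used
```

The remaining tallies (placement, interaction, timing) would need additional fields in each record and are left out of this sketch.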

In a careful analysis of the results of applying this test to fifty
normal males convalescing in a veterans' hospital, and to fifty
schizophrenic patients, Shneidman reported 64 “signs” that differentiated
the psychotic from the normal groups. Of these signs 42 were found
more frequently in the normal group and 22 in the psychotic. An
individual's score was then determined by subtracting the number of
psychotic signs from the number of normal signs in his record. These
scores alone showed marked and reliable differences between the
normal and the psychotic groups. Further analyses of the results yielded
qualitative indicators of the most usual schizophrenic trends:
individuality of response, self-identification, social isolation,
overinclusion, inappropriateness, symbolization, desire for environmental
simplification, inhibition of fantasied violence, punitive conscience,
lack of identification with normal masculine role, religiosity, and
debasement of women.
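
The scoring rule just described, normal signs minus psychotic signs, can be illustrated with a short Python sketch. The sign labels below are hypothetical placeholders; the 64 actual signs are given in Shneidman (1948) and are not reproduced here.

```python
# Hypothetical placeholder labels standing in for the published signs.
NORMAL_SIGNS = {"normal_sign_a", "normal_sign_b", "normal_sign_c"}
PSYCHOTIC_SIGNS = {"psychotic_sign_a", "psychotic_sign_b"}


def sign_score(observed_signs, normal_signs=NORMAL_SIGNS,
               psychotic_signs=PSYCHOTIC_SIGNS):
    """Return the number of normal signs minus the number of psychotic signs."""
    observed = set(observed_signs)  # each sign counts once per record
    n_normal = len(observed & normal_signs)
    n_psychotic = len(observed & psychotic_signs)
    return n_normal - n_psychotic


# A record showing two normal signs and one psychotic sign.
example_record = ["normal_sign_a", "normal_sign_c", "psychotic_sign_b"]
example_score = sign_score(example_record)
print(example_score)
```

A positive score indicates a record with more normal than psychotic signs; no cutting score is assumed in this sketch.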

Shneidman also points out that the MAPS Test may be of con- 
siderable value as a tool for studying prejudice, the psychology of 
minority groups, improvement during treatment, readiness for treat- 
ment, as a supplement to psychodrama, and as a therapeutic device. 

Three-Dimensional Apperception Test 

Doris Twitchell-Allen (1946) developed an interesting variation of
a miniature dramatic technique which employs twenty-eight small
plastic figures (Illus. 183). The figures include simple rectangular and
curved polygons to represent well-established gestalts, and one vague
human figure; and the rest of the figures are purposely made vague in
order to represent symbols and to elicit fantasy. The material is used
in a naming test and for storytelling or dramatic action. A 6-page
summary booklet allows space for the recording of both first
responses and the replies given during a period of inquiry. Tentative
norms are available for children, adolescents, and adults, with
respect to usual and bizarre associations, modes of organization, and
the use of objects in patterns. This test has some of the advantages
that the Rorschach has in its use of certain relatively unstructured
vague patterns into which the subject may project concepts and
feelings.

ILLUS. 183. TWITCHELL-ALLEN THREE-DIMENSIONAL 
APPERCEPTION TEST 



(By permission of Doris Twitchell-Allen and the Psychological Corporation.) 


SUMMARY 

The brief description of dramatic and play techniques given 
above does not do them justice, and may appear to oversimplify their 
administration and interpretation. At present none of the tests can 
be easily or quickly applied or interpreted, even by experienced
examiners. Although the progress with these tests is to date
considerable, controlled experimental variation with well-defined groups is
badly needed. There are good observation techniques and some 
norms for social play among children. Studies have been made of 
racial attitudes, maladjusted children, social groups, and candidates 
for employment. Role playing is being widely used in school, in in- 
dustry, and in military establishments. Miniature-drama techniques 
have been developed and used for diagnosis and therapy among
children and adults. They are on the way toward adequate norms for
interpretation.

STUDY GUIDE QUESTIONS 

1. What are the theoretical advantages and disadvantages of using play 
situations for appraising needs and drives? 

2. Compare the two main categories used to describe play with those of 
the TAT test. 

3. What sorts of basic personality structure are shown by evidence taken
from the results of the World Test? 
4. What aspects of the MAPS Test make it of value in projective testing?

5. What aspects of personality are reflected most accurately by natural 
play and role playing, and by using miniatures in play? 

6. How may significant group differences be used in determining personal 
characteristics? 



CHAPTER XX 


INTERESTS 




This chapter begins with a short analysis of the nature of
motivation. Following this are illustrations of five types of evaluation of
interest: case histories, logs, reasons for choices, special knowledge, 
and inventories. A discussion of some practical results is followed by 
a comparison of methods and a discussion of needed research. 

THE NATURE OF MOTIVATION 

Motivation is notoriously difficult to define, partly because of the 
complexity of the behavior involved and partly because of wide- 
spread unanalytical ways of thinking. Thinking in this field can often 
be made more accurate by the use of operational definitions, which
always refer to particular acts in a particular situation. For example,
in physics a force of one dyne is defined as the force which moves a 
mass of one gram a distance of one centimeter in a particular region 
of the earth’s surface. All other forces can be measured by comparing 
them with one dyne. Furthermore, work is defined as the total force 
exerted during a given period of time. Forces also have direction, in- 
dicated by the angle between an arbitrary plane or base line and the 
line of trend. 

In the appraisal of human behavior, a motive or an interest is simi- 
larly defined as that which moves a person or part of a person in a 
particular direction when he is in a given situation. In order to
measure a motive one must define its direction and give quantitative values
to its strength. This can be done by placing a person in a standard
situation and securing indications of the amounts of work he does 
to reach certain goals. The work or energy expended seems to depend 
on two independent factors called drive and incentive. Drive is
defined as internal tension, mental or physical, and incentive is the
external goal, the achievement of which relieves a particular tension.
For instance, a drive would be a dryness in the throat, and the incen- 
tive the glass of water, or the memory of how to get a glass of water. 

The most careful workers try to evaluate separately the effects of 
drive and incentive, because two persons may show the same amount 
of goal-seeking for different reasons. One person may have been de- 
prived of water for 48 hours and hence have an intense drive. An- 
other may have had plenty of water a half-hour earlier, but has an 
intense liking for the particular fluid that is being offered. 

For the most useful predictions it is better to secure scores which
represent known quantities of particular patterns of behavior, rather 
than an unknown combination of several patterns. This can be done 
in a laboratory, either by holding the incentive constant and meas- 
uring the strength of the drive or by holding the drive constant and 
measuring the effect of the incentive. 

In usual testing or classroom situations, however, it is not easy to 
control either drives or incentives. In most of the appraisals of inter- 
est described in this chapter, no attempt is made to distinguish be- 
tween drives and incentives. The projective techniques have made 
some progress in the appraisal of strength of drives (Chapters XVII, 
XVIII, XIX, and XXIII). 

In his analyses of motivation, Thorndike (1935) suggests that un- 
learned likes or dislikes should be distinguished from learned. He 
lists tentatively, without devising a scale, the sorts of unlearned drives 
which should be included: preferences for tastes, smells, bodily
temperature, muscular activity and rest, courtship and love, mother- 
hood, receiving favorable attention, successful competition, familiar 
surroundings and changes of scenery, and mental activity. Thorndike 
believed that an inventory of unlearned likes would probably not 
contribute as much to adult education as to the planning of early 
education. Items of this kind are included in almost all studies of
interest, attitude, and adjustment reported here.

Interests of children often seem to be motivated by desires for
security, for domination, for escape from arbitrary limitations of
home or school, and for unrestricted freedom and adventure. Such
interests often have little relation to vocational abilities and
opportunities. Adolescents' interests usually undergo changes concomitant
with the realization of abilities and limitations and the
establishment of more mature ideals. Adults are to some degree motivated by
escape, but they also have strong desires to build up and maintain
health, family, and financial security. Older persons usually show
strong interests in activities which provide security or escape from
limitations imposed by age.

Interests Defined 

There seems to be no standard or widely used definition of inter- 
est. Interests are concerned with enjoyment of or displeasure in past, 
present, or future experiences. They are acceptance reactions to all 
kinds of incentives both real and imagined. 

Two kinds of interests are often pointed out. One, called intrinsic 
or primary interest, is shown when a person does what he likes to 
do. He may sing a song, dance or work with numbers or with tools 
for the fun of it. Here the satisfaction is immediate and the activity 
is an end in itself. The other kind of interest, called extrinsic or
secondary, is shown when a person does a certain kind of work be- 
cause he believes it may bring him wealth or social satisfactions most 
quickly. There may be little or no satisfaction in the activity itself 
but the reward which is expected to come later is desired. Such inter- 
ests are remote, and the activity is a means to an end. 

Uses of Measures of Interest 

A survey of measures of motivation shows two important uses. 
First, educators and philosophers have emphasized for many centuries 
that one of the principal goals of human development is a strong and 
well-balanced set of interests. Recently the Commission on the Rela- 
tion of School and College of the Progressive Education Association 
indicated as one of the major goals of education, 

. . . that interests should be developed in each major area of living. These
may be classified broadly as economic interests, civic interests, interests
centering in the home, and recreational interests (Smith and Tyler, 1942, p. 317).

In order to gauge one's progress in developing worthwhile inter- 
ests, accurate evaluating tools are necessary. The same tools could be 
used in determining the most effective ways of developing interests. 

The second use of measures of motivation is almost the reverse of 
the first. Here appraisals of interest are used to help in choosing a 
vocation, on the assumption that present interest is a fair predictor 
of future success. This is typically a counseling and employment
approach. To date most of the studies emphasize vocational or economic
interest patterns. 


METHODS OF EVALUATION 

Cattell, Heist, and Stewart (1950) have described twenty-five
possible approaches to the measurement of interests, most of which
have been used only in a few laboratory situations. A list of this
type points to the need of a large amount of research, and to the 
possibility of the use of many more objective indicators of interests 
than are in use today. The list is summarized here:

1. Money: amounts and per cents of money spent on certain activities.

2. Time: per cent of time given to certain courses of action.

3. Opinionaire: self-appraisals of typical behavior.

4. Preferences: choices between courses of action, real or described.

5. Attention time: spontaneous attention to various stimuli.

6. Immediate memory: amount of material recalled soon after an
experience.

7. Reminiscence: amounts of material recalled spontaneously after
various periods of time.

8. Distraction: failure to perceive surrounding material when an
interesting object is present. Narrowing an attention area.

9. Retroactive inhibition: acceleration of forgetting due to subsequent
learning of more interesting material.

10. Information: amounts of facts related to a course of action in which
one is interested.

11. Speed of decision: time needed to make a decision or to reject or accept
a statement.

12. Level of skill: amounts of skilled activity in a field of interest, as skill
in playing the piano may indicate interest in music.

13. Misperception: errors in perception in the direction of an interest or
failure to notice errors due to a distracting interest.

14. False belief: the amount of distortion of factual statements in
supporting belief or interest.

15. Fantasy: the time spent in spontaneous fantasying or the choice
of fantasy reading where alternatives are presented.

16. Projection: two types of measures are noted: (a) a picture or verbal
statement of an activity is presented and the subject selects the best
explanation of the behavior, and (b) the subject chooses from a list those
activities for which he prefers to explain the motive.

17. Ego defense dynamisms: interests connected with ego conflicts may
be appraised from a defense dynamism, such as reaction-formation,
identification, and rationalization.

18. Fluency: the relative amounts written or spoken concerning various
activities.

19. Speed of reading: the rapidity with which material is read is thought
to indicate interest or the degree to which the subject agrees with the
statement. Difficulty must be controlled.

20. Work-endurance measures: the amounts of effort or discomfort
endured to obtain a desired end.

21. Psychogalvanic response: changes in skin resistance for various
activities.

22. Pulse rate: changes in pulse rate during various activities.

23. Metabolic rate: changes in metabolic rate during various activities.

24. Muscle tension: changes in general muscle tension during various
activities.

25. Writing pressure: the amount of pressure exerted in writing answers
to questions related to various activities.

In discussing these methods Cattell pointed out that the first two 
result from observations or free interaction over periods of consid- 
erable time. The next two are introspective self-assessments. Methods 
5 through 9 are tests involving immediate attention and memory, and 
10, 11, and 12 are tests of more permanent learning effects. Methods 
13 through 17 detect distortions of perception and belief, and
amounts of wishful thinking or fantasy, often in a frustrating situa- 
tion. Methods 18, 19, and 20 relate to output of energy or endurance 
of discomfort, and the last five methods seek measures of physiological 
changes. 
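
Cattell's grouping of the twenty-five methods can be written as a small lookup table. The sketch below is in Python; the group labels paraphrase the preceding paragraph, and the names `METHOD_GROUPS` and `group_of` are hypothetical.

```python
# Method numbers refer to the list of twenty-five approaches above.
# Each range end is exclusive, so range(5, 10) covers methods 5 through 9.
METHOD_GROUPS = {
    "observation or free interaction over time": range(1, 3),
    "introspective self-assessment": range(3, 5),
    "immediate attention and memory": range(5, 10),
    "more permanent learning effects": range(10, 13),
    "distortion of perception and belief, fantasy": range(13, 18),
    "output of energy or endurance of discomfort": range(18, 21),
    "physiological change": range(21, 26),
}


def group_of(method_number):
    """Return the group label for a method number from 1 to 25."""
    for label, numbers in METHOD_GROUPS.items():
        if method_number in numbers:
            return label
    raise ValueError(f"no such method: {method_number}")


print(group_of(7))   # reminiscence
print(group_of(22))  # pulse rate
```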

The five commonly used methods for evaluating interests are: (a) 
case histories or autobiographies, (b) reasons for vocational choices, 
(c) logs, (d) measures of special knowledge, and (e) inventories. These
are illustrated in considerable detail below. Case histories are
usually the best method of revealing a long-time trend and conflicting
forces. Observations are most accurate for showing the relations be- 
tween drives and incentives. Tests of special knowledge are thought 
to be closely related to interest when ability and opportunity are 
held constant. Vocational choices show individual practical judg- 
ments, and inventories are good for wide sampling of many interests 
in a way which makes comparisons of individuals meaningful. 

Case Histories 

The following case histories illustrate the complexity of factors 
which determine vocational interests (Fryer, 1931). 

SUBJECT 33: male; education, two years of college; age, about fifty years

As I think back, it seems to me that I have just gone from one relationship 
into another relationship, each growing naturally out of the previous sit- 
uation. I left college when I was preparing to be a surgeon. Because of 
family objections to a choice of that vocation, I went into business without 
any particular choice of vocation. I began at about twenty years of age and
continued in business until I was about thirty-two, making one change of 
vocation which was that of salesman; finally, I became an expert in style 
patterns. 

At the age of thirty-two a secretary of the YMCA was foolish enough to see 
some qualifications in me for the religious work of the Association and
extended to me a call. The determining factor in that change was the facing
up to all that would be involved in refusing it, and the consequent
conviction of moral and spiritual cowardice. This was a period of real crisis, and
for two months and a half, I studied the situation and wrestled with the
problem. The final decision was reached on as clear a venture of faith as 
anything I have ever undertaken, and speaking very honestly, I feel that 
the experiences of the past eighteen years have shown the wisdom of that 
venture in faith. I know that it was a great personal good to me to change 
from the interest in things and selling things to the interest in men and
the putting across of ideas and ideals. The mental reorganization that was
necessary, compelling me to go through about five years of hard study, gave
me a new mental freedom and developed within me a capacity of which I
was not aware. The settling of questions on the basis of deep inner
convictions has almost become a habit with me and that is why it is now difficult
to think back to the period when very many choices were dictated by chance
and opportunism. 

Subject 33 gives an account of a life that began on a casual and 
opportunistic basis. The vocation of surgeon is relinquished and 
business accepted without any definite, powerful interests being in- 
volved. The branch of business likewise seems not to have been very 
important. The basis for the great change after thirty, into an oc- 
cupation of social service, is not clearly given but the appeal ap- 
pears to have been unusually powerful. The resulting success in the 
new work, following success in the business world, suggests that more 
of the same qualities are involved than at first sight appears to be the 
case. However, a very definite shift in interest is obviously required 
to bring the salesman’s capacities into the service of the YMCA secre- 
tary. 

SUBJECT 34: female; education, A.B. degree; age, about twenty-five years
High School, from age fifteen to eighteen and one-half years 

Early in high school, possibly during first year, I became ambitious to 
become a nurse. This ambition persisted and increased in intensity of 
desire until I was about twenty years of age and in the second year of
university. Reasons for change: (1) marked opposition to nursing profession by
my parents; (2) unexpected opportunity to complete university course; (3)
gave up idea of becoming public-health nurse because I flunked chemistry, 
a requisite course at this time. 

University Life, from age nineteen to twenty-three years 

Early in college my aspirations to become a nurse waned, and I aspired to 
higher goals. I next went through a period of intense desire to become a
writer, colored partly by the fact that I had for instructor in English a poet 
who took a personal interest in my ambitions. This ambition I still harbor,
but without any very definite ideas or plans for the future. 

At about twenty-one or twenty-two years of age, I decided that I would 
like above all else to become a physician. This was impossible because of 
economic pressure so I went into sociology for the last two years of college, 
specializing in hygiene and medical social work. I always held very definitely 
to the idea of doing medical social work, and became part-time worker in 
a clinic during the last year of college. At this time (twenty-three years) I
was in the neuro-psychiatry department of the clinic, and became definitely 
interested in that particular field. This interest still persists and I still have 
faint hopes of studying medicine and specializing in psychiatry. It was 
because of this interest that I accepted my present position, so that I could 
be in close contact with psychiatric work.

I have always fostered a desire to become a dancer to such a degree that 
it has become an avocation with me. However, I feel it might have as well 
become a vocation, were it not for my family prejudices and objections 
early in my life, which prevented my ever taking dancing lessons.

Vocational interests mentioned by Subject 34 show a definite
sequence from the original interest in nursing, through medicine,
medical social work, to psychiatry. Outside of this line of
development, which has led to work related to the work of a psychiatrist, are
the two esthetic interests, one in writing and a stronger one in danc- 
ing. The influence of family prejudice is shown in checking the nurs- 
ing and the dancing as vocations, while the positive influence of 
others is shown in the writing and in the psychiatric work. 

SUBJECT 36: woman; age, slightly over forty years 

She had to go to work when she graduated at thirteen from elementary 
school. Before this she had had aspirations to be a teacher or a nurse. She 
was a clerk for a number of years, after which she did war work. Further 
office work was followed by positions in a hospital in which she took his- 
tories, passed on admissions, etc. Difficulties with a superior finally led to 
her leaving this position, which she had held for several years. Since that
time she has been unsuccessfully seeking work over a period of months. In 
the fall she secured a part-time selling job in a department store, and was
kept on after the Christmas rush. She does not like selling but she loves
keeping records. She thought she would like medical social work, anything
to do with hospitals or medicine. 

On the Stanford-Binet she has a mental age of 16 years. On the Stenquist 
she made a score less than that for the twelve-year-old median, but in spite 
of her lack of ability to do the problems she worked steadily without
complaints and tried practically all of them. She has a quiet, appealing
personality and claims to be very patient and hard-working. Her situation is such that
she must undertake whatever is open to her, and the probability appears to
be that she will be faithful and industrious in anything she tries.

Reasons for Choices 

These cases from Fryer show that educational and vocational 
choices resulted from a fairly large number of causes which were 
often in conflict with one another. Goals may change rapidly with 
maturation of abilities, and also with the acquisition of information
and with emotional conditioning. The causes for vocational choices,
therefore, deserve special study. A thorough study of this field would
involve all the dynamics of behavior. A number of special studies
limited to academic and vocational situations are all that can be
included here.

Choice of a vocation must be distinguished from interest or
satisfaction. Williamson (1937) states that nearly 40 per cent of men and
46 per cent of women do not choose the occupations they prefer.
Strong (1945) found that 36 per cent of Stanford seniors gave no
occupational choice or were not sure what they were going to do. Some
had strong negative choices and others were unwilling to admit a
low-level choice.

Reasons for vocational preferences may be inferred to some extent 
from the preferences of various age groups. The vocational choices
of 640 boys and girls from eight to seventeen years of age in Topeka,
Kansas, were reported by Lehman and Witty (1927). Each person
was asked to indicate which three occupations in a list of two
hundred he liked best. Marked changes with age were found; thus, among
girls, 5 per cent of the eight-year-olds and 32 per cent of the
sixteen-year-olds chose stenography. In the same group, movie actress was
named by 20 per cent of the eight-year-olds and 3 per cent of the 
sixteen-year-olds. Among boys the percentages increased with age 
for the professions and decreased for such occupations as cowboy, 
auto racer, and president of the United States. 

Fryer (1931) reported these same tendencies in a study of 181 early 
adolescents. The choices tended to change from exciting, romantic,
artistic, or frontier occupations to those which seemed more
practical. This shift of choices is probably not a true shift of likes or
enjoyment, but is due rather to a growing realization that normal
existence requires a considerable amount of monotonous, if not
unpleasant, work.

A number of reports which are available show marked tendencies 
toward impractical choices, even among late adolescents and adults. 
Loomis’ report (1949) is typical of many others. Among about three 
thousand Michigan high school senior boys he found that 40 per cent
had aspirations for professional work and 25 per cent really expected
to succeed in professional work. According to the 1940 census, how-
ever, only 16 per cent with twelfth grade or more education would
be able to secure professional work. Illustration 184 compares the
aspirations, expectancy, and real chances of employment for such
boys in seven areas of work: unskilled, semi-skilled, skilled, farmer,
clerical and sales, manager and proprietor, and professional.

Another approach to the evaluation of reasons for choices is that 
of direct questioning. Vernon (1938) derived from interviews with
forty-seven university women students the following list of main
drives: social conformity, humanism, activity, independence, se-
curity, ease, inferiority, power, and social admiration. Activity and
independence were named most frequently. Different drives were
sometimes found to lead to the choice of similar careers, and similar
drives to lead to the choice of different careers. Knowledge of the
interaction of all the factors of a situation was essential for a com-
plete understanding of the course of action.

Anderson (1934) required 673 college men to rank twenty-five 
occupations in order of their (a) contribution to social well-being in 
general, (b) contribution to one's social prestige, and (c) probable 
economic return. The numerical ranks for each occupation based on 
group medians are shown in Illus. 185. There is a marked relation-
ship between all three values. Correlations between rankings for
social contribution and social prestige made by agricultural, engi-
neering, business, and textile students were all approximately .82;
between social contribution and economic return, .72; and between



ILLUS. 185. OCCUPATIONAL VALUES

                              Median Ranking
                       Social       Social     Economic
                    Contribution   Prestige     Return
Clergyman . . . . .      2.9          4.8        14.5
Physician . . . . .      3.2          4.7         4.9
Professor . . . . .      4.6          6.6        10.9
Banker  . . . . . .      6.1          3.1         2.9
School teacher  . .      6.4         11.6        17.3
Manufacturer  . . .      7.6          6.7         3.0
Lawyer  . . . . . .      7.8          6.1         5.8
Farmer  . . . . . .      8.3         14.4        13.2
Engineer  . . . . .      8.4          9.4         6.4
Artist  . . . . . .      8.6          7.0         8.9
Merchant  . . . . .     11.0         11.7        10.5
Factory manager . .     11.7         11.4         8.6
Machinist . . . . .     15.6         17.2        12.9
Carpenter . . . . .     15.6         18.8        15.2
Bookkeeper  . . . .     15.8         15.6        17.0
Insurance agent . .     16.3         15.4        13.9
Salesman  . . . . .     17.0         15.5        14.0
Factory operative .     18.3         21.2        19.8
Barber  . . . . . .     18.9         20.1        19.9
Blacksmith  . . . .     19.6         21.7        20.1
Baseball player . .     19.8         14.2         8.9
Soldier . . . . . .     20.7         21.7        23.7
Chauffeur . . . . .     23.0         23.1        22.5
Man of leisure  . .     23.8          7.3        14.4
Ditch digger  . . .     24.9         25.5        24.6

(Anderson, 1934, p. 443. By permission of the Editor, Journal of Social Psychology.)

prestige and economic return, .86. The rankings by various classes in
college were very similar, and two classes three years apart also gave
almost identical rankings. Anderson concludes that these occupa-
tional values are widely and uniformly held and play an important
part in the selection of one's vocation.

A study by Greene (1938) of college sophomores showed that 81 per 
cent of 278 men and 64 per cent of 320 women had made fairly defi- 
nite choices. The main reasons given for these choices are shown in 
Illus. 186. From these results it appears that men were more fre-
quently influenced by considerations of advancement than women,
and that women more often than men considered opportunity for
employment and service to others as sufficient reason for choice. Per-
sons who had chosen teaching or other professions usually gave "serv-
ice to others" as the main reason for their choice. The fact that most
persons had not considered "ability as tested or demonstrated" or
"enjoyment of the work" of major importance in choosing a career
is challenging. In the long run these two factors are probably impor-
tant for vocational adjustment. This points to a need for more accurate
self-appraisal.

ILLUS. 186. REASONS FOR VOCATIONAL CHOICES OF
COLLEGE SOPHOMORES

N = 598

                                              Per Cent
Reason Given                                Men     Women
 1. Opportunity for employment                1       15
 2. Opportunity for training                  2        0
 3. Initial income                            3        6
 4. Expectation of advancement               54       13
 5. Desire to be of service                  18       41
 6. Pleasantness of the work                  5        9
 7. Ability as tested or demonstrated         5       10
 8. Family tradition                          9        4
 9. Location, part of country                 2        0
10. Health                                    1        2

(Greene, 1938.)

Logs

Direct observation of behavior is most effective in a laboratory
where conditions are well controlled, but it is also widely applicable
in industry, school, home, and recreational situations. Direct observa-
tion either results in a tabulation, or a log of actual activities, or a
rating which attempts to summarize these activities.

Logs can be made by self or others. One check of interests is a
Reading Record reported by Smith and Tyler (1942). More than a
thousand high school students entered their unassigned or voluntary
reading on a record each morning for two weeks. Later a shorter form
was prepared to be made out once a week and to include a rating of
how well the book or article was liked. These records yielded indica-
tion of amount, variety, special emphasis, and maturity. Similar check
lists of radio and motion picture programs yielded measures of ex-
tent, character, experiences, and degree of satisfaction. Students were
found to be listening to the radio about 2 hours a day, far more
time than was spent in voluntary reading. Records of radio listening
seem to be one of the most valid and reliable indicators of interest
in music, drama, current affairs, and social problems. They bring
out clearly interests that are undesirable or a waste of time. The data
gathered by activity records are illustrated by the following sum-
mary.

Elizabeth read 15 books during the year. Fiction included Mary John-
ston's To Have and To Hold, Churchill's The Crisis, The Prince and the
Pauper, Bertita Harding's Farewell 'Toinette, and Let the Hurricane Roar,
two college stories, Iron Duke and College in Crinoline, one dog story, The
Count of Monte Cristo, The Girl of the Limberlost, Anne of Green Gables.
Nonfiction included The Boy's Life of Will Rogers, Life with Mother, Men
Are Like Street Cars, and Daily Except Sundays. Eight of these books were
read during the summer and seven during the school year. The class of
students of which Elizabeth is a member read an average of 12 books during
the summer and 24 books during the school year. She did not read books of
as great difficulty and maturity as did the group as a whole. The fiction she
read is distributed over Levels III (e.g., The Crisis), II (e.g., Jock the Scot),
and I (Girl of the Limberlost), whereas the median maturity level of the
fiction read by the group as a whole is IV.

In October 1938 Elizabeth checked New Yorker as the only magazine
she read regularly; in March 1939, Life. In October she was reading no
magazine completely; in March, two: Life and Look. She was below the
class median in the number of magazines read regularly and the number
read completely. This evidence, together with the number of books which
she read, suggests that she does not like to read to an extent comparable
with other students in her group.

Elizabeth far exceeded most of the members of her class in the number
of motion pictures which she attended. She recorded seeing 39 during the
summer and 86 during the school year. The median number of motion pic-
tures attended by students of her class during the school year was 27; the
range, 0 to 99. Also, she saw many of these 86 different motion pictures
more than once. Evidently, then, a large amount of her leisure time was
spent in viewing motion pictures. During the year, Elizabeth saw two plays,
The Boys from Syracuse and Abe Lincoln in Illinois, and attended a per-
formance of The Mikado. The median number of plays, operas, and con-
certs attended by students in her class, however, was five.

Elizabeth's five favorite radio programs in December 1938 were Benny
Goodman, Bob Crosby, Kay Kyser, Make Believe Ballroom, and Tommy
Dorsey. Of the 19 programs which she checked as the ones she listened to
regularly, seven were dance orchestras such as the ones listed as favorites.
In addition to dance music, she listened regularly to five variety programs,
three question-and-answer programs, two dramatic programs (Big Town
and Lux Radio Theatre), and to Walter Winchell and Jimmie Fiddler.
Elizabeth was approximately at the median of her class in the number of
programs she heard regularly.¹

This account shows that Elizabeth’s leisure was largely spent on 
motion pictures and radio of a popular sort, and that her reading 
was limited in amount and maturity. 

Special Knowledge 

A fairly large number of workers have pointed out that people
often enjoy and hence are interested in what they do well. A good
talker likes to talk and a good electrician likes to make fine installa-
tions and repairs. From this it follows that measures of skill or special
knowledge may be good indicators of interest, better perhaps than
inventories, which may reflect desire for escape rather than an in-
trinsic satisfaction. The relation between special knowledge and in-
terest needs a good deal of investigation, however, because knowledge
results from several factors — opportunity to learn, learning ability, 
good emotional balance, industry and health. Few careful analytical 
studies are at hand, although analytical tests of knowledge have been 
developed by the United States Army Air Force Testing Division and 
the Cooperative Test Service, and in the Michigan Vocabulary Pro- 
file. Wesley, Corey, and Stewart (1950) reported a comparison be- 
tween Kuder Preference Record scores and various measures of 
ability, for example, The Iowa High School Content Examinations 
for Science and English Literature, The Stanford Arithmetic Test, 
The Meier Art Judgment Test, The Seashore Measures of Musical 
Talent, and The Minnesota Test for Clerical Workers. 

¹ From E. R. Smith and Ralph Tyler, Appraising and Recording Student Prog-
ress, pp. 334-35. Harper & Bros., 1942.

Intra-individual correlations for each of 156 male college students
were computed using three procedures. In one, in which the individ-
ual's deviations from group means were used, the correlations aver-
aged .30; in another, in which his deviations from his own means were
used, the correlations averaged .42; and in the third, in which a rank
order correlation was used, the correlations averaged .46. The lowest
correlation, between musical interest and tests of tonal memory and
discrimination, was about .23. The correlation between preferences
and measures in the artistic, scientific, and clerical fields was about
.33. A correlation of nearly .50 was found in the mechanical and com-
putational fields, and of .68 in the literary field. These figures may
mean that interests are little related to ability to perform in art and
music, where appreciation rather than performance is often the
chief satisfaction. Additional research is needed to determine rela-
tionships between knowledge and interest in the clerical, artistic, and
musical fields.
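
The third of the procedures mentioned above, a rank order correlation computed within one person's record, may be sketched in Python as follows. This is only an illustration: the seven field names follow the text, but the interest and ability figures for the single student are invented, and the handling of tied ranks (average ranks) is an assumption, since the report's exact computations are not described here.

```python
from typing import Sequence


def ranks(values: Sequence[float]) -> list[float]:
    """Rank values from 1 (lowest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_order_correlation(interest: Sequence[float], ability: Sequence[float]) -> float:
    """Rank order correlation: the product-moment correlation of the ranks."""
    return pearson(ranks(interest), ranks(ability))


# Hypothetical record for ONE student (invented figures, not from the study).
FIELDS = ["musical", "artistic", "scientific", "clerical",
          "mechanical", "computational", "literary"]
interest_scores = [20, 35, 60, 40, 55, 45, 70]
ability_scores = [30, 25, 50, 45, 65, 40, 60]

rho = rank_order_correlation(interest_scores, ability_scores)
print(f"intra-individual rank order correlation: {rho:.2f}")
```

Averaging such coefficients over all students would give a group figure of the kind quoted in the text; the invented record above is not meant to reproduce any reported value.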

Inventories of Preferences 

In evaluating interests most authors and counselors feel that prac-
tical, ideal, and recreational activities should be included, because
all are important in securing and holding some types of employ-
ment, and in achieving a well-balanced life. Furthermore, considera-
tions of ability, financial rewards, and opportunity are usually to be
avoided in appraising interests since these may not be closely related
to intrinsic satisfaction in a type of work. This counseling approach
has been effective in producing scales which with few exceptions have
only small correlations with ability and choice of a specific occupa-
tion among the groups studied, but which seem to predict long-time
trends in satisfaction in fairly wide areas.

The simplest inventories are short check lists or blanks for in- 
dicating one's preferences. The most elaborate include about 1,200 
items of various kinds. In some cases blanks are to be filled in by a 
counselor after an interview and in others, a person is to rate him- 
self. Both interviews and questionnaires have good and bad points. 
The questionnaire allows a person to check a larger number of items 
than could be considered in a short interview, but an interview often 
yields a more coherent account of developments and conflicts be-
tween goals. The questionnaire is free from whatever effect the inter-
viewer's personality may have, but it lends itself to intentional mis- 
representation which may be reduced by a careful interviewer. An 
inventory often shows relative strength of preference more clearly 
than an interview, because it allows a quantitative comparison with 
the ratings of large groups of persons. 

Today there are at least thirty published questionnaires in the
United States. The five which will be described here are by Strong
(1938), Kuder (1942 and 1948), Lee and Thorpe (1943), Thurstone
(1948), and Guilford-Shneidman-Zimmerman (1948). These are not
necessarily the best available for a particular situation, but illustrate
different approaches and are based on a good deal of careful thinking
and research.

Strong Vocational Interest Blank (1938). Strong published his
first edition in 1927, a revised edition for men in 1938, a large book
of results and conclusions in 1945, and a revised scale for women in
1947. He used some of the items and followed closely the same meth-
ods used by Ream (1924) and Freyd (1922). The present men's and
women's blanks consist of four hundred items each (of which 163
are identical items) divided into eight parts as follows:


100 names of occupations
 36 school subjects
 49 names of amusements
 48 activities
 47 peculiarities of people
 40 order of preference of activities
 40 comparisons of two activities
 40 self-ratings of characteristics


These items were selected to give wide coverage and for their power
to indicate differences in interest among a large variety of occupa-
tional groups. It was found that the average high school and college
student would check the total blank in about 40 minutes. The direc-
tions ask for first impressions, and instruct the subject not to ponder
long over any item. Scoring keys are now available for 38 male oc-
cupations and 24 female occupations, for 6 occupational groups, and
for maturity of interest, occupational level, and masculinity-feminin-
ity.

The men's schedule was standardized in 1938 on groups of men
who were considered to be successful in a particular occupation.
These men, whose average age was about forty-three years, had been
engaged in one occupation at least 3 years. Most of these groups in-
cluded from five hundred to one thousand men, each of whom earned
$2,500 a year or more.

The procedure for making a scoring key was the same for each
occupational or other group. First the percentages of those who
marked each item like (L), indifferent (I), or dislike (D) were found
and compared with similar percentages for a large general group.
The differences between the special and the general groups were
found for each item. For instance, in calculating the weights to be
assigned to the first item, "actor," when the interest of personnel
managers was to be evaluated, the per cents of personnel managers
who checked each response were compared with the per cents of the
general group, thus:



                        L      I      D
Personnel managers     49     38     13
All others             38     35     27
Difference            +11     +3    -14


For ease of handling, the differences were transposed into smaller
figures ranging from +4 to -4, so that the final weights for this
item were L = +2, I = +1, and D = [value illegible]. An individual's score is
the sum of his plus and minus marks on all items. If his score falls
within the highest 69 per cent of the scores of personnel managers,
he is given an A rating in this field. If his score falls within the next
lowest 29 per cent of the scores, he is given a B rating. If it is lower
than the score obtained by the lowest 2 per cent of the personnel-
manager group, he is given a C rating.

ILLUS. 187. ENGINEERING INTEREST SCALE NORMS

                                    Percentile Scores
 Raw    Standard            513        306 Stanford   285 Stanford
Score    Score    Rating  Engineers     Freshmen       Freshmen
 220      76        A        99
 210      74        A        99
 200      71        A        99
 190      69        A        98
 180      67        A        96            99             99
 170      64        A        93            99             99
 160      62        A        89            99             98
 150      60        A        83            98             96
 140      57        A        75            97             95
 130      55        A        66            95             94
 120      53        A        57            92             92
 110      50        A        48            88             88
 100      48        A        42            87             83
  90      45        A        33            83             80
  80      43        B+       24            79             76
  70      41        B+       17            75             73
  60      38        B        14            69             68
  50      36        B         9            63             66
  40      34        B-        6            59             62
  30      31        B-        4            53             54
  20      30        B-        2            48             44
  10      27        C+        1            42             41
   0      24        C+        1            37             35
 -10      22        C         1            31             27
 -20      20        C         1            25             22
 -30      17        C                      20             18
 -40      15        C                      14             12
 -50      12        C                      10              9
 -60      10        C                       7              7
 -70       8        C                       5              5
 -80       5        C                       2              2
 -90       3        C                       1              2
-100       1        C                       1              2
-110      -2        C                                      1
-120      -4        C                                      1

(Strong, 1938, p. 10. By permission of the Stanford University Press.)
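
The steps of the scoring procedure just described (item differences, a summed score, and a letter rating) may be sketched in Python. The sketch is illustrative only. The percentages for the item "actor" come from the text; the two-item key, the answers, and the criterion-group scores are hypothetical, and the rule for transposing differences into weights from +4 to -4 is not given in the text and is therefore not attempted. The exact treatment of scores falling on a boundary is likewise an assumption.

```python
from bisect import bisect_left

RESPONSES = ("L", "I", "D")  # like, indifferent, dislike


def percentage_differences(special: dict, general: dict) -> dict:
    """Difference (special group minus general group) for each response."""
    return {r: special[r] - general[r] for r in RESPONSES}


def blank_score(answers: dict, key: dict) -> int:
    """Sum of the plus and minus weights for the responses a person marked."""
    return sum(key[item][response] for item, response in answers.items())


def letter_rating(score: int, criterion_scores: list) -> str:
    """A: within the highest 69 per cent of criterion scores; B: the next
    lowest 29 per cent; C: below that (boundary handling is an assumption)."""
    ordered = sorted(criterion_scores)
    below = bisect_left(ordered, score)  # criterion scores strictly lower
    n = len(ordered)
    if below * 100 >= 31 * n:
        return "A"
    if below * 100 >= 2 * n:
        return "B"
    return "C"


# Percentages for the item "actor", as given in the text.
personnel_managers = {"L": 49, "I": 38, "D": 13}
all_others = {"L": 38, "I": 35, "D": 27}
differences = percentage_differences(personnel_managers, all_others)

# Hypothetical key and answers, for illustration only.
hypothetical_key = {
    "item_1": {"L": 2, "I": 1, "D": -2},
    "item_2": {"L": -1, "I": 0, "D": 1},
}
answers = {"item_1": "L", "item_2": "D"}
total = blank_score(answers, hypothetical_key)

# Hypothetical criterion group: one hundred scores, 1 through 100.
criterion = list(range(1, 101))
print(differences, total, letter_rating(50, criterion))
```

Since a separate key is needed for each occupational scale, the summing step would be repeated once per scale, which is the labor the text refers to.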

The A rating indicates marked similarity of interests with those of
persons successfully engaged in an occupation. The B rating shows
some similarity, and the C rating, no similarity or contrary tenden-
cies. An individual's score may also be transposed into standard scores
or centiles from norms, such as those shown for engineers in Illus.
187. The blank must be scored once for each occupational scale de-
sired. Hence, if scores are desired for thirty-eight occupations, it is
necessary to score the blank thirty-eight times. The labor involved
is large, whether the scoring is done by hand or machinery. A report
blank (Illus. 188) gives a profile which indicates both strength and
direction of interest.

Three uses for these profiles are suggested by Strong: in hiring
employees, in admitting students in college, and, chiefly, in aiding a
person to decide upon which occupation to enter. A low score in
interest of musicians, for instance, is taken to mean that one would
probably not be satisfied in this occupation. A high score presumably
means that one's satisfaction will be as great as most of those in the
profession. If one has several high scores, some occupation may be
found which will combine these interests.

Strong (1938) reports fairly high reliabilities. The mean correla-
tion of odd-even items was .87 for the thirty-six occupational scales
based on records of 285 Stanford seniors. After an interval of one
week the retest correlation averaged approximately .869. The mean
retest correlation on twenty-one occupational scales over a 5-year
period was .75 for a group of 285 college men. This is a high figure
for retests of this kind.

The intercorrelations between scales for 273 seniors showed a
number of scales to correlate higher than .60. Strong has placed these
in groups which, for the most part, seem logical. (They are discussed
on page 584.)

Kuder Preference Record (1942, 1948). Three scales, outdoor,
mechanical, and clerical, have been added to the seven previously
established in 1939, namely, computational, scientific, persuasive,
artistic, musical, literary, and social service. Instead of 330 2-choice
items, the 1942 form has 168 3-choice items, yielding a larger number
of scores. All the items have the same form (Illus. 189).

The testee indicates his choices by pushing a large pin through a 
4-sheet folder. Seven sides of these pages are printed with scoring 
keys. To secure scores in each area of interest, one simply counts the 
number of circles in the key which have pin holes. One point is al-
lowed for each. (Another procedure uses a machine-scored answer
sheet.) A V score indicates validity or adequacy in following direc-
tions on the test. If the V score is below or above the limits set by
experience, the student made careless mistakes, or he is an extreme
deviate. Raw scores are transferred to a profile sheet (Illus. 190),
which yields centiles based on large groups of students in the last 3
years of high school.

ILLUS. 188. STRONG VOCATIONAL INTEREST TEST PROFILE

[Figure: Hankes Report Form for the Strong Vocational Interest Test, Men.]

(By permission of E. K. Strong and the Stanford University Press.)

Note: Scores to the right of the shaded area show interests similar to those in the
occupation; scores to the left show less than average interest.

ILLUS. 189. KUDER PREFERENCE RECORD, VOCATIONAL

This blank is used for obtaining a record of your preferences. It is not a test. There
are no right or wrong answers. An answer is right if it is true of you.

A number of activities are listed in groups of three. Read over the three activities
in each group. Decide which of the three activities you like most. There are two
circles on the same line as this activity. Punch a hole with the pin through the left-
hand circle following this activity. Then decide which activity you like least and
punch a hole through the right-hand circle of the two circles following this activity.

In the examples below, the person answering has indicated for the first group of
three activities, that he would usually like to visit a museum most, and browse in
a library least. In the second group of three activities he has indicated he would
ordinarily like to collect autographs most and collect butterflies least.

EXAMPLES

Put your answers to these questions in column O.

P. Visit an art gallery
Q. Browse in a library
R. Visit a museum

S. Collect autographs
T. Collect coins
U. Collect butterflies

(By permission of G. F. Kuder and the Science Research Associates, Inc.)
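
The hand-scoring of the punched folder, in which one counts the key circles that have pin holes, may be sketched in Python. The two example groups and their answers follow the directions quoted in Illus. 189; the scoring key shown is hypothetical and does not reproduce any actual Kuder key, and the representation of a circle as an (activity, side) pair is simply a convenience of this sketch.

```python
def punched_circles(choices: dict) -> set:
    """Turn 'most' and 'least' choices into the set of punched circles.

    choices maps a group label to (most_liked, least_liked) activity letters.
    The left-hand circle marks the activity liked most, the right-hand circle
    the activity liked least.
    """
    holes = set()
    for group, (most, least) in choices.items():
        if most == least:
            raise ValueError(f"group {group}: most and least must differ")
        holes.add((most, "left"))
        holes.add((least, "right"))
    return holes


def scale_score(holes: set, key_circles: set) -> int:
    """One point for each circle in the key that has a pin hole."""
    return len(holes & key_circles)


# Answers from the examples in the directions: museum most, library least;
# autographs most, butterflies least.
example_choices = {"first": ("R", "Q"), "second": ("S", "U")}
holes = punched_circles(example_choices)

# Hypothetical key for one interest area (illustration only).
hypothetical_key = {("R", "left"), ("P", "left"), ("U", "right")}
print(scale_score(holes, hypothetical_key))
```

A separate key set would be needed for each interest area, corresponding to the printed keys on the sides of the folder.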

Separate norms are furnished for men and women. Men and high
school boys average higher in mechanical, computational, scientific,
and persuasive interests, while the women and high school girls have
higher raw scores in artistic, literary, musical, social-service, and
clerical interests. The sex differences are a little smaller among high
school students than among adults. The adult male sample shows
significantly higher scores than the sample for high school boys in
persuasive and social-service interests and slightly lower scores in
musical and scientific interests. Adult women and high school girls
have similar means and standard deviations.

To aid in interpreting the scores, Kuder has provided a table of
occupations listed according to major interests. If your preference
profile shows only one high score, the occupations listed in that area
should be given special consideration. If the profile shows two high
scores, a list of occupations which combines these two areas is pro-
vided. If there are more than two high scores, pairs of high scores
should be considered in turn. If there are no high scores, that is,
none above the 75th centile, then lower scores are considered with
reservations. If all scores are near the medians (a very rare occur-
rence), the person's preferences may be evenly balanced, or he may
have no well-developed interests, or his interests may fall in the areas
of personal service or manual labor, which are not scored on this
form.

[Illus. 190. Profile sheet for the Kuder Preference Record.]

(By permission of G. Frederic Kuder and the Science Research Associates.)

Very low scores are considered by Kuder to be indicators of areas
to be avoided. The counselor should also ascertain whether a person's
interest is in appreciating or in working in an activity. Many persons
who show fairly strong interests in art, music, literature, social prob-
lems, and science regard them as hobbies rather than vocations.

One should also consider only the occupations for which a person
has ability as shown by aptitude or achievement. "For example,
electricians and electrical engineers probably have similar preference
profiles, but would be expected to differ materially in ability."
(Kuder, 1942, p. 6.)

Kuder took unusual care to develop scales that are independent
of each other and reliable. His original scales (1939) were prepared
by placing items with high correlations together, and eliminating
items which had more than one high correlation with items in other
fields. His first scale, that for literary interest, was developed with
great internal consistency, and the other scales were gradually devel-
oped to include pure measures of independent areas. The last two
scales, mechanical and clerical interests, could not be prepared with
as great independence as the others which were already established,
because independent items were not found.

The reliabilities for the separate scales, when repeated after 3 days
with graduate students, were all above .90, and other reliability cor-
relations for groups of grade school, high school, and college stu-
dents, and employed adults averaged close to .90.

The intercorrelations between separate scales were found for six
groups: high school girls and boys, college men and women, and
employed men and women. These showed a slight tendency for the
male groups to have higher intercorrelations than the female, but
there were no significant differences between age groups or between
employed and unemployed groups. Disregarding signs, the range of
correlations was from .00 to .56 with the over-all median in the
neighborhood of .20. The median correlations between some of the
scales are given in Illus. 191.

Kuder also presents scale-score means and standard deviations for 
forty-five small occupational groups of employed men, and for 
twenty-three groups of employed women. Graphic profiles of the 
same groups are also given in order to facilitate a comparison be- 
tween an individual's profile and that of a group in which he may be
interested.

A masculinity-femininity score has been developed by weighting
each raw score by a multiplier and adding the weighted scores. The
multipliers were determined by comparing the function of each
scale in discriminating between the sexes. Thus, the mechanical
scale has a weight of plus 73, the computational, plus 101, while the
artistic scale has a minus 74. A high plus score indicates masculinity.

ILLUS. 191. CORRELATIONS BETWEEN KUDER INTEREST SCORES
FROM THE MANUAL, KUDER VOCATIONAL
PREFERENCE RECORD, 1942

Correlation                        Men       Women
Positive
  Mechanical-Scientific            .42    [illegible]
  Clerical-Computational           .49        .43
Negative
  Mechanical-Literary             -.44       -.31
  Mechanical-Social-Service       -.20       -.22
  Mechanical-Clerical             -.30       -.14
  Scientific-Persuasive           -.38       -.38
  Scientific-Musical              -.29       -.29
  Scientific-Clerical             -.24       -.20
  Artistic-Persuasive             -.20       -.15
  Artistic-Literary               -.20       -.15
  Artistic-Social-Service         -.25       -.30
  Artistic-Computational          -.26       -.24
  Clerical-Artistic               -.25       -.29
  Clerical-Social-Service         -.20       -.23

(By permission of G. F. Kuder and the Science Research Associates, Inc.)

A score for a specific occupational group, accountants and auditors, 
has been derived by a similar procedure, that is, by weighting each 
raw score according to the degree that it distinguished between peo- 
ple in general and the persons in this group. Whether or not these 
weighted scores will be of more value than the profile is still to be 
demonstrated. 
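
The weighting procedure used for the masculinity-femininity score, and in the same manner for the accountant-auditor score, amounts to a weighted sum of raw scale scores. A minimal Python sketch follows. Only the three multipliers named in the text (mechanical +73, computational +101, artistic -74) are used; the multipliers for the remaining scales are not given here and are simply omitted, so the result is a partial illustration and not an actual Kuder score. The raw scores are invented.

```python
# Multipliers quoted in the text; the other scales' multipliers are not
# given there and are deliberately left out of this partial sketch.
PARTIAL_WEIGHTS = {"mechanical": 73, "computational": 101, "artistic": -74}


def weighted_score(raw_scores: dict, weights: dict) -> int:
    """Multiply each raw scale score by its weight and add the products."""
    return sum(weights[scale] * raw_scores[scale] for scale in weights)


# Hypothetical raw scores for one person (invented for illustration).
raw = {"mechanical": 50, "computational": 30, "artistic": 40}

partial_mf = weighted_score(raw, PARTIAL_WEIGHTS)
print(partial_mf)  # a higher plus value points toward the masculine direction
```

An occupational score of the accountant-auditor kind would use the same function with a different set of multipliers.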

Occupational Interest Inventory, Lee and Thorpe (1943). The au-
thors state that "the major purpose of this inventory is to aid in 
discovering basic occupational interests possessed by an individual 
in order that he may become or remain an interested, well-adjusted, 
and effective person, as well as a profitable employee.” They believe 
that interests are associated primarily with certain types of activities, 
not with occupations as such, hence the form consists of items which 
usually require a choice between two or more activities. The form
yields scores for six fields of interest, three types of interests, and a 
level of vocational aspiration. The classification of items according 
to fields, types, and levels, is "based upon the obvious nature of the 
items themselves.” 

The six fields of interest are each indicated by 40 items; 10 items 
representing a low skill level; 20, a medium level, and 10, a high level. 
In preparing the items, a low skill in one field was paired with a low 
skill in another field, a medium with a medium, and a high skill with 
another high skill. Eight choices in each field were matched with 



570 


DYNAMIC PATTERNS 


eight choices in each of the other fields. Thus, the deciding factor is 
the field of activity rather than the level of complexity. The six fields 
of interest are: 

1. Personal-Social: personal service, social service, teaching, law enforce-
ment, health, and medical service.

2. Natural: farming, forestry, and lumbering activities.

3. Mechanical: machine operation, repairing, construction work, and
designing activities.

4. Business: clerical, bookkeeping, accounting, sales, supervision, and
management.

5. Arts: painting, drawing, decorating, landscaping, drama, literary, and
musical activities.

6. Sciences: laboratory, engineering, chemistry, biological research, and
physics.

The three types of interest scores, which are secured from the same 
items, are verbal, manipulative, or computational. The verbal in- 
clude oral, reading, and writing activities in sales, business, teaching, 
and the arts. The manipulative involve gross and fine dexterity, such 
as in shipping, crafts, typing, and surgery. The computational in- 
clude use of numbers in business and science. 

The level-of-interest scores are derived from thirty additional items
in which three levels are contrasted in the same field of interest. 

The form is scored by simply adding items which are coded simi- 
larly on the face of the form, and then changing the raw score to a per-
centile and placing it on a profile sheet (Illus. 192). This hand-scoring
operation takes about 15 minutes per person. The percentile norms 
are furnished for male and female separately. The females average 
significantly higher in personal-social, business, arts, verbal, and com- 
putational, while the males seem more interested in natural, mechan- 
ical, sciences, and their level of interest scores are higher. The sexes 
show nearly the same results on manipulative scores. 
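
The pairing plan and the additive scoring described above may be checked with a short Python sketch. It assumes one reading of the text: that each of the six fields is matched eight times with each of the other five, which gives forty choices per field. The respondent shown is invented, and the conversion of raw scores to percentiles is left out because the norms are not reproduced here.

```python
from collections import Counter
from itertools import combinations

FIELDS = ("Personal-Social", "Natural", "Mechanical",
          "Business", "Arts", "Sciences")
CHOICES_PER_PAIR = 8  # eight choices in each field matched with each other field


def paired_items() -> list:
    """Every pairing of two fields, repeated eight times."""
    return [pair for pair in combinations(FIELDS, 2)
            for _ in range(CHOICES_PER_PAIR)]


def field_scores(chosen_fields: list) -> dict:
    """Raw score per field: the number of items answered in that field's favor."""
    counts = Counter(chosen_fields)
    return {field: counts[field] for field in FIELDS}


items = paired_items()

# How often each field appears among the pairings (expected: 40 each).
appearances = Counter(field for pair in items for field in pair)

# Invented respondent who always prefers the first-listed field of a pair.
chosen = [pair[0] for pair in items]
scores = field_scores(chosen)
print(len(items), dict(appearances), scores)
```

Under this reading there are 15 pairs of fields and 120 paired choices in all, and the maximum raw score in any one field is 40.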

The authors report retest reliabilities after 4 weeks, using the 
same form on one hundred twelfth-grade students, as at or above 
.88, median about .90. 

Thurstone Interest Schedule (1948). This form consists of a single
11- by 17-inch sheet divided into one hundred rectangles (ten columns
of ten each). In each rectangle two titles of professional or
semi-professional occupations are printed. The subject is asked to
circle the preferred title and to cross out the disliked title, or if
he prefers, both titles may be circled or crossed out. Thus, there are
four hundred different answer combinations.

ILLUS. 192. SAMPLE PROFILE, OCCUPATIONAL INTEREST INVENTORY

[Profile chart for Paul Brown, Lincoln High School, showing fields of
interests, types of interests, and level of interests.]

This illustration shows the inventory data for a twelfth-grade boy and
shows that his major interest is in the field of the Sciences. His
second highest interest is Mechanical and the two lowest are in the
Personal-Social and Natural fields. In types of interests his choices
are high in the Manipulative and he appears to avoid choices requiring
Verbal activity. His level of interests indicates a preference for
activities of moderate difficulty. We should expect that he should be
more interested and successful in aspects of the science field which
are associated with construction and mechanical pursuits and that he
should probably avoid those in which talking or writing and dealing
with people is a primary requirement.

An adequate interpretation in this, as in all instances, requires that
Occupational Interest Inventory data be supplemented by information
regarding mental maturity, physical condition, personality, special
aptitudes and abilities, and educational background.

(Lee and Thorpe, p. 7. By permission of the authors and the California
Test Bureau.)

Ten vocational fields were selected, including Spranger’s six life
interests as well as other categories which factorial analyses have
shown to be independent and which seem vocationally significant.
They are:

P.S.  physical science
B.S.  biological science
C     computational
B     business
E     executive
P     persuasive
L     linguistic
H     humanitarian
A     artistic
M     musical

This is a short check list designed to be used when an honest
expression of choices can be expected. Its purpose is not disguised.
It requires less than 10 minutes of the subject's time. It gives a
profile of ten scores and the scoring, which can be finished in about
2 minutes, requires no stencils or other equipment. The interpretation
is made immediately from a profile sheet, printed on the last page of
the form.

A separate mechanical category is not included, since it is thought
to be well represented in physical sciences. The computational
occupations include those which use the results of computations, for
example, tax specialist, rather than actual computational work, such
as bookkeeper. The linguistic category represents skill in
communication rather than in literature.

The reliabilities for each scale, as shown by two split-half methods 
using two hundred cases, were all at or above .90. Also, each item 
was correlated with the scale to which it contributes. The item 
validities ranged from .30 to .95 and averaged about .78, which is 
high for this type of scale. 
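
A split-half reliability of the kind reported above can be computed along the following lines. The text does not say which two split-half methods were used; the odd-even split and the Spearman-Brown step-up shown here are assumptions chosen as a common illustration, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list of item scores (e.g. 0 or 1) per person.

    Assumed split: odd-numbered items against even-numbered items.
    """
    odd_totals = [sum(person[0::2]) for person in item_scores]
    even_totals = [sum(person[1::2]) for person in item_scores]
    return spearman_brown(pearson(odd_totals, even_totals))
```

The same `pearson` function, applied to one item's scores and the scale totals, gives an item-scale correlation of the sort described as item validity in the paragraph above.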

The intercorrelations between the ten scales for a group of two 
hundred men, all high school graduates, ranged from .37 to .68. The 
highest of these, as might be expected, were: Business-Computation 
.57, Business-Executive .64, Executive-Persuasive .68, Persuasive- 
Linguistic .51, and Artistic-Musical .49. 

To find a score for a category one simply counts the circled items 
in the proper row and column. The unmarked or crossed-out items 
are ignored. The profile sheet (Illus. 193) is made by simply marking 
these raw scores in the columns for each occupational field. 
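
The counting rule can be illustrated with a short sketch. The layout assumed here is a guess made only for the example: the first title in the rectangle at row r, column c is taken to belong to the field of row r, and the second title to the field of column c. The printed form may be arranged differently, and the names `FIELDS` and `field_scores` are hypothetical.

```python
# Field abbreviations; "C" stands for computational.
FIELDS = ["PS", "BS", "C", "B", "E", "P", "L", "H", "A", "M"]


def field_scores(marks):
    """Count circled titles per field on a 10 x 10 sheet.

    marks[r][c] is a pair (first, second); each element is "circle",
    "cross", or None. Crossed-out and unmarked titles are ignored.
    Assumed layout: first title -> field of row r, second -> field of column c.
    """
    scores = {field: 0 for field in FIELDS}
    for r, row in enumerate(marks):
        for c, (first, second) in enumerate(row):
            if first == "circle":
                scores[FIELDS[r]] += 1
            if second == "circle":
                scores[FIELDS[c]] += 1
    return scores


def blank_sheet():
    """An unmarked sheet, convenient for building examples."""
    return [[(None, None) for _ in range(10)] for _ in range(10)]
```

Under this assumed layout each field can receive at most twenty circles, ten from its row and ten from its column.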

ILLUS. 193. THURSTONE INTEREST SCHEDULE PROFILE

(By permission of L. L. Thurstone and The Psychological Corporation)

In the interpretation of an individual profile, Thurstone states 
that since the raw scores are directly comparable, a person's relative 
strength of interests can be seen without comparison with group 
norms. The ten scores are arranged in a chart with the most analytical 
occupations on the left and the more social and artistic categories on 
the right. The general slope of the profile, therefore, gives a rough 
indication of these broad types of interests.

The Guilford-Shneidman-Zimmerman (G-S-Z) Interest Survey (1949). This
form differs from those already described in that it secures separate
scores for hobbies and for vocational interests by having one
indicate, for each of 360 items, whether it represents a hobby, a
vocational interest, both, or neither. The items are scored to yield
centiles in nine main divisions and eighteen subdivisions, as shown in
Illus. 194.

The items are all short phrases, for example, "Judge entries in a
photo contest" and "Direct an orchestra or a band."

Each item is scored for only one trait, and twenty items are used
for each interest area. The present Form V is the result of an
internal-consistency item analysis on 540 items, using three hundred
college men. The twenty items in each field which showed the highest
consistency were retained.

The G-S-Z Interest Survey may be administered either to individuals
or to groups and should be completed in about 45 minutes, but there
is no time limit. The examinee is instructed and required to keep the
answers he has given covered while working on each new column so that
he can locate the proper spaces on the answer sheet and also prevent
earlier answers from influencing later ones. A novel scoring method
using location on the answer sheet makes it possible to score the
whole sheet and prepare the profile in about 3 minutes.

Norms are furnished for high school and college students, each sex
separately. Split-half reliability coefficients for two hundred high
school students ranged from .60 to .95, median .87. Intercorrelations
between the nine categories (not given) are expected to be small, and
between hobbies and vocational interests large. It is interesting to
note from Illus. 194 that most of the students check many more items
as hobbies than as vocational interests.

PRACTICAL RESULTS 

During the past few years the application of scales, such as those
just described, to many thousands of persons, most of them students,
has probably had far-reaching results. Just the act of filling out a
questionnaire may make a person more interested in analyzing himself
and in learning about occupations. The discussion of interest scores,
whether admitting their limitations or exaggerating their values, has
doubtless led to more thorough vocational planning, and more
satisfactory adjustments among many persons. For the number of
questionnaires administered, surprisingly little in the way of results
is yet reported. Aside from the measurement of groups to establish
norms, most studies have been based on samples of less than one
hundred persons, often less than fifty. The results of applying
interest questionnaires will be discussed under four headings: age and
sex differences, predicting success in school, predicting vocational
success, and correlations with personality measures.

ILLUS. 194. PROFILE: GUILFORD-SHNEIDMAN-ZIMMERMAN INTEREST SURVEY

[Profile chart, G-S-Z Interest Survey, high school norms.]

(By permission of the authors and the Sheridan Supply Company.
Copyright 1948.)

Age and Sex Differences

Probably the most careful studies are those reported by Strong
(1945), who found that likes for a particular item remained much the
same over wide age ranges, but differed considerably between sexes.
When educational and occupational levels were held fairly constant,
he found that about two fifths of the items of the Strong Blank for
Men showed straight-line increases or decreases in percentages of
likes from fifteen to fifty-five years. Some of these items, for
example, aviator, decreased considerably (from 60 per cent to 20 per
cent). Others, for example, raising flowers and vegetables, increased
from 25 per cent to 60 per cent. The majority of items showed less
variation. Another two fifths of the items showed curvilinear
patterns, either rising from fifteen to twenty-five years and then
declining, as was the case with playing tennis, or decreasing from
fifteen to twenty-five years and then rising, as with fishing. The
other one fifth of the items showed different age curves for different
groups of men.

In spite of these changes, the rank-order correlation of item
position was .82 between fifteen and twenty-five years, .88 between
twenty-five and fifty-five years, and .73 between fifteen and
fifty-five years, which means that items that were well liked by
fifteen-year-olds were also well liked by twenty-five- and
fifty-five-year-olds.
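
A rank-order correlation of this kind can be computed by ranking the items within each age group according to the percentage of persons liking them, and then correlating the two sets of ranks. The sketch below is a generic illustration of that procedure, not Strong's own computation; the function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from largest (rank 1) down; tied values share a mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_order_correlation(likes_group_a, likes_group_b):
    """Correlate the item ranks of two groups (same items, same order)."""
    return pearson(average_ranks(likes_group_a), average_ranks(likes_group_b))
```

Each input list holds one figure per item, for example the percentage of fifteen-year-olds and of fifty-five-year-olds marking the item "like."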

Strong also reported a tendency for total likes to increase from
fifteen to twenty-five years, and to decrease slightly thereafter. The
increases were principally in those activities which are least
familiar to fifteen-year-olds: linguistic activities and occupations,
self-rating of present abilities, school subjects, influencing others
as in teaching, supervision, sales, and cultural amusements. Little
difference was found between fifteen and twenty-five years in likes
for physical skill and daring, working conditions, working with
things, mechanical pursuits, noncultural amusements, and unfortunate
people. The slight decreases in likes between twenty-five and
fifty-five years were in physical skill and daring, writing activities
and occupations, and interference with established habits.

Similarities of sexes, as shown by rank-order correlations of average
likes of each sex for various items, are given in Illus. 195. These
figures, which are based on reactions of fifteen-, twenty-five-, and
fifty-five-year-olds of both sexes, show that the averages of the two
sexes are most alike in their self-ratings of Present Abilities and
their likes for Kinds of People and for Comparisons between Items.
They are least alike in preferences for School Subjects, Occupations,
and Activities.

ILLUS. 195. COMPARISON OF INTERESTS, MALE AND FEMALE.
FROM STRONG, 1945

                                          Correlations: Male vs. Female
Items                                         Likes      Attitudes *
100 Occupations                                .28          .36
 36 School Subjects                            .26          .27
 49 Amusements                                 .66          .69
 48 Activities                                 .27          .28
 47 Kinds of People                            .96          .96
 40 Order of Preference of Activities          .50          .49
 40 Comparisons between Items                  .82          .87
 40 Present Abilities                          .95          .95
400 Entire Blank                               .69          .71

* Attitudes is the average like-minus-dislike score.
(Reprinted from Vocational Interests of Men and Women by Edward K. Strong
with the permission of the author and of the publishers, Stanford
University Press.)

Nearly all authors of scales report that when the same scale is used,
men and boys show higher average scores in scientific, mechanical,
and computational interests, while women and girls average higher
in musical arts and social service interests. Persuasive and
manipulative interests seem to be more equally distributed between the
two sexes, depending somewhat on the particular groups under
consideration. Thus, Strong (1945, p. 229) reports that purchasing
agent and vacuum cleaner salesman correlate positively with male
interests, while life insurance salesman, advertising man, and lawyer
correlate significantly with female interests. Girls are thought to be
slightly more mature in their academic and vocational choices than
boys.

Predicting Success in School 

Thorndike (1912, 1917, 1921) reported correlations between stu- 
dents’ self-ratings of interests in seven educational subjects and their 
self-ratings of abilities in these same subjects. He found correlations 
by a rank-order method of approximately .89, and concluded that 
interests were highly predictive of abilities. King and Adelstein (1917) 
reported that on two different occasions they found correlations of 
approximately .73 between self-ratings for interest and for ability. 
Fryer (1927) made a similar study of college students and reported 
correlations of approximately .60 when the number of academic
subjects was not limited. These correlations are probably spuriously
high because they are all based on self-ratings. 

Terman (1925) compared self-ratings of interest in school subjects 
with teachers' estimates of ability in the same subjects, among 527 
normal children and 643 gifted children with IQ’s of 140 or higher. 
The separate correlations for boys and girls in each group averaged 
approximately .417. 

Garretson (1930) found only insignificant correlations between
academic interests measured by his blank and academic grades in
high school and scores on Terman's Group Test of Mental Ability.
Between commercial interest and grades in commercial subjects the
correlation was also zero. Between technical interest and grades in
technical subjects the correlation was .20.

Strong (1945, p. 321) reports that among 141 freshman dental
students who completed the blank there were no differences in
scholarship found between those rating A, B+, B, or B- on the dentist
scale, but students rating C had inferior grades. Of those rating A or
B+, however, 92 per cent graduated; of those rating B or B-, 67 per
cent; and of those rating C, only 23 per cent graduated. Moreover, the
fifty students who rated A on the dental scale usually had high
interest in the occupations which correlate over .50 with dentistry
(physician, chemist, engineer), while the opposite was true of those
with C interest ratings.

Super (1947) summarized seven reports by others of relations between
the Kuder Scores and school grades. The results show correlations
among small groups of from nearly .00 to .60, median about .30,
when course grades are compared to interest scores in a similar area.
Correlations are slightly higher for boys than for girls and for
scientific than for nonscientific subjects. In some instances the
range of interest scores is so small for a group that the correlations
are not indicative of the true relationship.

Bolanovich and Goodman (1944) found that although interest
scores yielded low correlations with grades of women trainees in
electrical engineering, those who continued training showed higher
Kuder scientific and computational interest, and lower persuasive
interest than those who dropped the course for various reasons.

Attempts to predict interest areas from intelligence-test scores, or
vice versa, have usually resulted in insignificant correlations. When
the vocational choices are arranged in order of complexity or amount
of training needed, however, students in large populations have a
tendency to choose occupations at or a little above their ability
levels. Such choices do not necessarily reflect primary interests, for
they are also influenced by practical considerations.

It seems that interest scores do not predict scholastic achievement
well, since this is determined to a large degree by ability, industry,
and previous preparation. In the long run, however, interest does
seem to have a marked effect on completing a course of study.

Predictions of Vocational Success 

A widely accepted belief is that enjoyment of a type of activity 
plays a dominating role both in the selection of an occupation and 
in continuous and successful employment in it. This belief is neither 
widely confirmed nor denied by the present results. Many persons 
doubtless succeed in a vocation in spite of early or continued dis- 
likes. Reasons for occupational choices often include financial and 
moral considerations. One would not expect, therefore, that a state- 
ment of probable enjoyment of a type of work, which a person may 
never have tried, would clearly predict success in it in later years. In 
order to evaluate the part played by interest in vocational success, 
one would have to study a large group of persons over a period of 
years, and note in what ways and to what extent interests influence 
vocational choices and success. 

Data of this sort are meager. One study by Strong (1945) reports a 
10-year follow-up of college seniors. Of 400 men who were requested 
to complete the form, 287 complied in 1927. Of these, 223 returned 
another blank in 1932, and 197 in 1937. Of the 197, 39 could not be
followed, because they entered occupations for which Strong had
prepared no scales. Of the total, 99, or 50.3 per cent, were sure of
their choices in 1927 and made no occupational changes, while 41,
or 20.8 per cent, were sure in 1927 but nevertheless had changed
occupations by 1937. There were 17, or 8.6 per cent, who were not sure
of their choices in 1927, but made no change, while 40, or 20.3 per
cent, were not sure but by ten years later had changed. Many of the 
changes were normal types of vocational development, however, as 
from engineer to physicist, production manager, teacher of mathe- 
matics-science, or sales manager. Also, the interest of the occupation 
to which he changed was in many instances nearly as high as that 
of the occupation left, for there was a mean difference of only 3.4
points in the Standard Score Scales. College students usually have
three or four fairly high scores on the Strong Blank, and the occupa- 
tions on which these scores are made may have much in common. 
Hence, it is clear that this study indicates more continuity of interest 
than is indicated by the percentage who changed occupations.
Moreover, the original choice was usually not the occupation on which
the highest score was made, but the occupation chosen had a median 
rank of 2.2 for those who continued in that occupation 10 years, and 
of 2.9 for those who changed from it. The “changed-to" occupation 
ranked 4.2, one of the five highest scores. The prediction among 
seniors is good, therefore, that their score which is one of the highest 
five on the Strong Blank will indicate the occupation 10 years later. 
These predictions would doubtless have been more accurate, if the 
study had reported the scores for rather broad classes of occupa- 
tions. 
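
The percentages quoted in this follow-up can be checked directly from the four counts given in the text. The short sketch below does only that arithmetic; the dictionary keys are labels invented for the example.

```python
# Counts reported for the 197 men who returned the blank in 1937.
counts = {
    "sure_no_change": 99,
    "sure_changed": 41,
    "unsure_no_change": 17,
    "unsure_changed": 40,
}

total = sum(counts.values())

# Each group as a percentage of the total, to one decimal place.
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Share of the whole group that changed occupation, sure or not.
changed = counts["sure_changed"] + counts["unsure_changed"]
percent_changed = round(100 * changed / total, 1)

if __name__ == "__main__":
    print(total, percentages, percent_changed)
```

The four counts sum to 197, and about 41 per cent of the men changed occupations during the ten years.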

Another study by Strong (1945) reported a follow-up study of 174
college freshmen over a period of 9 years. The results were similar to
those of the study cited above, but about 50 per cent of the freshmen
changed their occupational choices during the 9 years.

Hahn and Williams (1945) found that among Marine Corps women
reservists the typists, stenographers, and general clerks who were
satisfied with their work made significantly higher Kuder clerical
interest scores than those who were dissatisfied.

An interesting application of an interest questionnaire to an
industrial situation is that reported by Bolanovich (1948) for groups
of inexperienced women in factory jobs. In order to establish a scale,
271 items were applied to 666 women. Seven months later each
item was studied to determine how well it differentiated women who
quit within 3 months from those who stayed on the job more than 6
months. Weights were then assigned to each of 111 items of plus or
minus one or two points. From these weighted scores a fairly
accurate prediction of tendency to stay on the job was made. It was
estimated that turnover could be reduced ... per cent during the first
3 months. The interest blank applied at another plant in a different
state was found to work equally well there, that is, turnover during
the first 2 months of employment, which was very extensive, could
be cut in half by selecting only those applicants who received scores
among the highest 60 per cent on the interest scale. Contrary to
expectation, the workers who stayed on the job did not say they liked
activities and conditions similar to those in the factory, but rather
that they liked very simple activities that were free from any
responsibility for thought or application.
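
The weighted-scale procedure lends itself to a brief sketch: each retained item carries a weight of plus or minus one or two points, an applicant's score is the sum of the weights of the items she endorses, and only applicants in the upper part of the score distribution are selected. The item names, weights, and cutoff fraction below are invented for illustration, and it is an assumption that weights attach only to endorsed items.

```python
def weighted_score(answers, weights):
    """Sum the weights of the items the applicant endorsed.

    answers: dict mapping item id -> True if endorsed.
    weights: dict mapping item id -> -2, -1, +1, or +2.
    """
    return sum(w for item, w in weights.items() if answers.get(item))


def select_top_fraction(scores, fraction):
    """Return applicant ids in the top `fraction` of scores, highest first."""
    keep = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep]


if __name__ == "__main__":
    # Hypothetical items and weights, for illustration only.
    weights = {"simple_routine_work": 2, "plan_own_work": -1, "supervise": -2}
    applicants = {
        "a1": {"simple_routine_work": True},
        "a2": {"plan_own_work": True, "supervise": True},
        "a3": {"simple_routine_work": True, "plan_own_work": True},
    }
    scores = {name: weighted_score(ans, weights) for name, ans in applicants.items()}
    print(select_top_fraction(scores, 0.5))
```

Whether such a cutoff actually reduces turnover can only be established, as in the study described, by following the selected applicants on the job.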

Correlations with Personality Measures 

Since almost all interest questionnaires include items regarding
personal adjustment, and since a satisfactory job is often a large
factor in satisfactory emotional balance, and since particular types
of adjustment may reduce or change alleged interests a good deal,
the question of how closely interests and adjustments are related is
extremely interesting. As yet there are few reports. One which shows
a good research pattern is discussed here.

Lewis (1947) compared the Kuder Preference Record scores with
the Minnesota Multiphasic Personality Inventory scores (MMPI) for
fifty white male insurance agents, mean age 44.7 years, who had sold
insurance three or more years, and for fifty white female social
workers, mean age 37.7, who had a median of about 8.8 years of
experience. The insurance men showed a median centile of 93.2 on
persuasive interest, 64.0 in musical, 55.9 in social service. All the
rest were lower than 50. The social workers showed a median centile of
92.0 in social service interest, 72 in literary, 55.0 in persuasive.
All the rest were lower than 51.

In an attempt to discover relationships between interest and per- 
sonality characteristics, Lewis compared the MMPI scores of the 
highest and lowest quarters of salesmen, as shown by the Kuder 
Persuasive Scale. On nearly every scale the highest quarter showed 
slightly more normal scores than the lowest, and the differences were 
most noticeable for depression and psychasthenia. A similar compari- 
son of the highest and lowest quarters of social workers showed 
smaller differences, but in the same direction. Lewis concludes that 
there is a slight tendency for those who are less interested in their 
work area to be less well adjusted personally. The MMPI profiles 
for salesmen and social workers are both close to normal and fairly 
similar. Lewis thinks the differences are likely to be due to more 
psychological sophistication among the social workers. 

Correlations among Inventories 

From an inspection of the forms and their bases of construction, 
high agreements are not expected, and actually have not been found. 
Super (1947) reviewed five studies of the relation between Kuder 
and Strong scores on small samples. All the correlations range from 
low but significant (Literary-Author .28) up to moderately high 
(Scientific-Chemist .73) with most of them falling at approximately 
.40 (Computational-Accountant .49). Thus, the Kuder Persuasive 
Interest emphasizes a great variety of promotional activities and 
personal contacts, while the Strong salesman must somehow close 
a deal to stay in business. 

Thurstone (1948) reported the correlations between scores on the 
Kuder (1942) Preference Record and the Thurstone Interest Schedule 
shown in Illus. 196. 

ILLUS. 196. COMPARISON OF KUDER AND THURSTONE
INTEREST SCORES

Kuder              Thurstone             Correlation
Scientific         Physical Science          .62
Mechanical         Physical Science          .63
Scientific         Biological Science        .39
Computational      Computational             .63
Literary           Linguistic                .66
Persuasive         Persuasive                .66
Social Service     Humanitarian              .69
Musical            Musical                   .72
Artistic           Artistic                  .48

(By permission of L. L. Thurstone and The Psychological Corporation)

The conclusion must be reached that most of the scales are made
up or scored differently enough to be sampling different patterns,
even though the names given to the scores are similar. At this stage
of development research is probably more desirable than high
conformity with preceding work.

NEEDED RESEARCH 

In order to secure cooperation and accurate scores, an interest
inventory should have the following characteristics:

a. Lack of ambiguity: evidence that the item scores represent a
person accurately is needed.

b. Good coverage: all important vocational areas should be
represented by separate scores.

c. Factorial analysis: evidence of the relative independence of
items and of scores.

d. A good rationale of scores.

e. Reliability high enough for individual prediction.

f. Freedom from intentional misrepresentation.

g. Free expression of choice.

h. Optimum number of items.

i. Validity.

Although the previous parts of this chapter bear witness that a
good deal of progress has been made in evaluating interests, still any
one of the present questionnaires can be improved in one or more
of the following respects.

Ambiguity 

Questioning a little those who have filled out a questionnaire will
usually reveal that different interpretations are often given to the
same item. It seems probable that some of the age, group, and
individual differences that now appear may be largely due to
differences in the interpretation of items and not to differences in
satisfaction. Thus, two persons may have the same experience and
satisfaction in "Spending the summer as a camp counselor," but one
marks the item "dislike," thinking chiefly of how much time would be
spent in getting the children to take care of their clothes and obey
the rules, while the other marks it "like," remembering the hiking,
outdoor cooking, and singing around the campfire. Or two other persons
may mark the item "like," but also for different reasons. One likes to
work with children, but dislikes rough-and-ready camp life, while the
other dislikes caring for children, but enjoys outdoor life. A good
deal of research and revision is needed to develop items which will
have the same or nearly the same meaning for all those whose interests
are being evaluated. No accurate comparison of persons can result when
items are interpreted differently.

Coverage 

A comparison of five schedules (Illus. 197) shows that each author 
has usually tried to measure areas which are given somewhat the same 
names, with the exception of Strong. Strong’s method of scoring does 
not yield areas directly, but occupational scores, which in some in- 
stances are easily grouped into areas by factorial analysis or other 
logic. Guilford carries the logical division to nine areas, each of 
which is subdivided into two subareas which seem to him to be 
fairly independent. Thus, Lee and Thorpe combine clerical and 
persuasive with business, while Thurstone and Guilford subdivide 
business into executive or leadership and into mercantile or man- 
aging a business. Lee and Thorpe, and Guilford combine art and 
music, while the others keep them separate. Thurstone finds good 
evidence for separating biological from physical science interests, 
while Guilford finds significant differences between scientific in- 
vestigative and scientific theoretical interests. 

Strong’s groups, based solely on correlations above .60 between 
his scales in a group of college seniors, are shown in Illus. 188. Strong 
prefers not to identify them as scientific, sales, etc., because he feels 
that at present such generalization is not well justified. For instance, 
he reports types of salesmen who do not belong in Group IX, but in 
Group V or VIII. For purposes of a rough comparison, however, 
Strong’s groups are included in Illus. 197 to show that he has covered 
these areas to a marked degree, perhaps better than some of the 
other authors. None of the inventories yield scores for important 
semiskilled and skilled groups of workers. 

Factor Analysis of Interests 

Inspection of the various questionnaires reveals a preponderance 
of three sorts of subject matter: familiar activities, names of occupa- 
tions, and personal adjustments. 

The familiar-activity items include recreations, hobbies, travel 
and social activities, and work and work situations. It is claimed that 
they have two outstanding advantages. First, because of their limited 
scope and familiar content, they can be answered more quickly and 
surely than less familiar names of occupations or complicated proc- 
esses. Second, it is assumed that the person who fills out the form 
does not know how the items are to be scored and hence he cannot, 
even if he wishes, intentionally misrepresent himself in order to get
a job or to seem more socially acceptable. Research is needed to show
to what extent these supposed advantages are real, because there is
good evidence that other types of subject matter are about as valid,
and that persons in high school can intentionally "warp" their
scores significantly.

Names of occupations usually include most of the professions and
managerial positions, technical work, and skilled trades. Thurstone's
Interest Schedule consists entirely of such items and in other
questionnaires usually from 25 to 33 per cent of all the items are
occupational titles or similar items. They are thought to have two
advantages: First, they call attention to a whole job so that one
reacts to a complicated set of memories combined into a pattern. This
reaction may be different and more significant vocationally than
reactions to the separate tasks. Second, one may be inclined to regard
names of occupations as possible vocational choices rather than as
recreations or as hobbies. If this reaction leads to a more serious
consideration of interests, it will be an advantage. But if it means
that practical aspects, such as wages and opportunity, are considered
rather than intrinsic interest, it will be a disadvantage. Another
possible disadvantage is that the titles may represent occupations
about which the student knows little.

Personal adjustments include being annoyed, worrying, assertion
and submission, nervous habits, day dreaming, and health. This
subject matter is also typical of personality questionnaires (Chapter
XXII). A good deal of research is needed to determine what the
basic personality factors are, and how important each is for a
particular occupation or vocational area. A few factorial analyses are
reported below, all of which, however, have failed to show patterns
of activities or adjustments needed for an occupation, because they
are all based on scores which combine a great variety of items.

Thurstone (1931) applied a multiple-factor analysis to the results
of Strong's Vocational Interest Blank for Men. From the eighteen
scores secured for each of 237 persons, four main factors appeared.
These seem to correspond fairly well to interests in science,
language, business, and people. He pointed out that the variance of
the scores was not fully accounted for by these four factors, and
conducted another study based on the Thurstone Interest Schedule,
which consists of eighty-nine names of occupations which are to be
checked to show a person's like, indifference, or dislike toward the
occupation named. A factorial analysis of these eighty-nine items
yielded seven independent factors, which were tentatively identified
as: descriptive (of persons and social situations), commercial,
physical science, biological science, legal, athletic, and academic or
literary. He also suggested an eighth factor, interest in art, which
was not clearly indicated in his study because only a few items
referred to artistic activities. Illustration 193 shows an individual
profile based on factors from the Thurstone Interest Schedule.

In 1934 Strong made a factorial analysis of his own blank, using
both men and women. The results yielded five general factors,
tentatively called an interest in people, business, intellectual activities,
science, and language. Interest in people also corresponded to feminine
interests to a marked degree. On the basis of similar factor patterns,
Strong grouped the occupations into eleven categories (Illus. 188). 

The Strong Blank was applied to 133 boys in high school by Carter, 
Pyles, and Bretnall (1935). The boys ranged from twelve to nineteen 
years of age, with a median of approximately sixteen. Using Strong's 
scales, twenty-three scores were obtained for each boy. These were 
correlated with one another and with age. The correlations with age 
ranged from -.11 for journalist to .33 for purchasing agent, median
.06. The intercorrelations of interest scores proved to be similar to 
those found for college men, with the exception of four scales — those 
for minister, YMCA secretary, schoolman, and personnel manager. 
Previously, Strong had found these four scales to be more highly
correlated with maturity of interests than the other scales. Factorial
analysis of the correlation matrix by Thurstone's method yielded
three factors which seem to correspond fairly well to those found by
the other investigators, namely, interests in persons, in science, and
in language. A fourth factor, not so clearly isolated, seems to
correspond to interest in business. The authors point out that the sum of
the squares of the loadings had a mean value of .83 for seventeen
scales. Those same scales had previously been reported to have a mean
reliability of .88. The small difference between these figures indicates
that specific factors are small in the different scales.
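
The comparison of the two figures above can be put as simple arithmetic. The sketch below assumes the usual factor-analytic partition, in which the reliable variance of a scale is its communality (the sum of squared loadings on the common factors) plus its specific variance. The function names and the loadings for the single example scale are hypothetical and are not taken from the study.

```python
def communality(loadings):
    """Sum of squared loadings of one scale on the common factors."""
    return sum(a * a for a in loadings)


def specific_variance(reliability, h2):
    """Reliable variance not accounted for by the common factors."""
    return reliability - h2


# Mean values quoted in the text for the seventeen scales.
mean_communality = 0.83
mean_reliability = 0.88
mean_specificity = specific_variance(mean_reliability, mean_communality)
print(round(mean_specificity, 2))  # 0.05

# Hypothetical loadings of one scale on four factors (illustration only).
example_loadings = [0.70, 0.50, 0.30, 0.10]
print(round(communality(example_loadings), 2))  # 0.84
```

On this reading only about 5 per cent of the variance of an average scale is both reliable and specific to that scale.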

In spite of the similarity of factors shown in these studies, the factor
loadings for the various professional scales are not very similar. The
specificity of items shown by factorial analyses is greater for groups
of men than of boys. These results doubtless are due to the different
experiences of the groups tested. The interpretations of items by
boys are probably more vague and also more romantic and
adventuresome than the interpretations by men.

Dwyer (1938) found that nineteen of the Strong occupational scores
could be calculated from only four scores. If the scores of a person
were known in the fields of physicist, minister, life insurance
salesman, and journalist, then all the other scores, with the exception
of those for certified public accountant and farmer, could be calculated
with remarkable accuracy.

Crissey and Daniel (1939) have reported a factorial analysis of
eighteen scores on Strong's Vocational Interest Blank for Women.
By using Thurstone's centroid procedure, four factors were found
which accounted quite well for their correlation matrix. They
tentatively named these factors:

1. Male association: housewife, nurse, secretary or stenographer, and
general office worker

2. Interest in people: Y.W.C.A. secretary, lawyer, and teacher of social
science

3. Interest in language: teacher of English, teacher in general, librarian,
and author

4. Interest in science: physician, dentist, teacher of mathematics, and
teacher of physical science

One must conclude from these factorial studies that the most
valuable analyses will come from work which appraises separately each
item in a battery. When a number of items are added together, as in
the scales described above, the total score will usually not represent
a single variable but a number of variables in unknown proportions.
Factorial analysis of these data, then, is dependent to a large degree
upon the particular grouping of items made by the author, and only 
indirectly upon the true relationship of the items, as shown by their 
coexistence in the same person. Factorial analysis of preference scores 
will usually give different results in different groups, since the groups 
have varying characteristics. Part of the difficulty is due to the failure 
of questionnaires to give adequate descriptions of activities or voca- 
tions. A check list of vocational names, such as banker or engineer, 
will be variably interpreted by persons according to their own ex- 
periences. Methods of factorial analysis, when used on unambiguous 
items, will show, perhaps more clearly than other methods, the unique 
patterns of interest that exist in a given group. 

An analytical approach is needed which will separate work-activity 
interests from other factors, such as personality characteristics and 
adjustments. Satisfactions which are closely related to basic drives
might profitably be separated from those which seem primarily re- 
lated to local or temporary incentives. The present scores of Kuder, 
Strong, and others, which combine several unknown factors into one 
score, give rough interest scores, but fail to analyze the independent 
personal traits which are needed for accurate counseling and em- 
ployment. 

Rationale of Scoring 

Two systems of scoring are common. One, used by Strong, yields
a single score to show one's position in an occupational group. The
score is computed from a key which weights all the items in the blank
in accordance with significant differences between men-in-general
and the group under consideration. This system assumes that most
persons in an occupation have some direct satisfaction in the
activities or conditions of work. This assumption needs to be explored.
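
The idea of a key built from group differences can be sketched as follows. This is a simplified illustration and not Strong's actual weighting formula, which is not given here. It assumes that each item is weighted by the difference in the per cent of "like" responses between the occupational group and men-in-general, and that small differences are ignored. All item names, percentages, and the threshold are hypothetical.

```python
def build_key(pct_group, pct_reference, threshold=10):
    """Weight each item by the difference in per cent 'like' responses.

    Differences smaller than `threshold` points are treated as not
    significant and receive a weight of zero (a simplification).
    """
    key = {}
    for item, pct in pct_group.items():
        diff = pct - pct_reference[item]
        key[item] = round(diff / 10) if abs(diff) >= threshold else 0
    return key


def score(likes, key):
    """Sum the weights of the items a person marked 'like'."""
    return sum(key[item] for item in likes)


# Hypothetical percentages of 'like' responses (not Strong's data).
engineers = {"repair a clock": 70, "write a poem": 10, "sell insurance": 20}
men_in_general = {"repair a clock": 40, "write a poem": 30, "sell insurance": 22}

key = build_key(engineers, men_in_general)
print(key)  # {'repair a clock': 3, 'write a poem': -2, 'sell insurance': 0}
print(score(["repair a clock", "sell insurance"], key))  # 3
```

Because every weight depends on the reference percentages, a different men-in-general group would produce a different key from the same occupational group.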

Hoppock (1935) has shown a tremendous range of job satisfactions,
from a maximum in certain professions to a minimum in unskilled
hard labor. Almost any occupation involves four or five different
activities, some of which may be distasteful. An actor spends nine
tenths of his time in rigorous drill. Also, any occupational group will
usually include persons who do quite different tasks although they
are given the same occupational title. Any single score fails to show
these differences in interest patterns between occupations and
persons.

Furthermore, Strong (1945, p. 567) points out that the selection
of the group of men-in-general is a matter of much importance. For
instance, he found that to distinguish well between the various
professions, the men-in-general group must have a preponderance of
men in high-level occupations. And similarly the crafts and trades
can only be well distinguished when the men-in-general group is
composed of those in lower-level occupations. When the levels are
mixed, the distinction between all occupations is usually much less
clear. Thus, he finds that for 285 college seniors, the correlation was
-.42 between lawyers and accountants when the men-in-general
group was of high level and +.61 when it was of low level, a
difference of more than 100 points. In 66 such comparisons he found
a median difference of about 40 points, but one third of them were
differences of more than 60 points. These facts give limitations to
the use of single scores computed in this fashion, and again indicate
the need for a better qualitative analysis to show just what the
differences between groups are.

The other scoring system is used by almost all other authors of
interest blanks. It yields a profile of from eight to ten scores, one for
each field of interest. The score for each field is secured by adding
the likes and sometimes subtracting the dislikes for items which are
considered to be pertinent to that field. The items are chosen to
give a wide sampling, and are grouped into fields either by
correlation techniques or by the judgments of persons who are thought to
be well acquainted with the various areas. In the best scales those
items that cannot be agreed upon are usually omitted, and no item is
used to indicate interest in two different fields. This method at least
attempts an analysis of relatively independent kinds of interests, and
should be greatly extended.
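
The profile method just described can be sketched in a few lines. The item names, the two fields, and the function name are hypothetical. The sketch assumes only what the paragraph states: each item belongs to a single field, likes are added, and dislikes are sometimes subtracted.

```python
from collections import defaultdict


def profile(responses, field_of, subtract_dislikes=True):
    """Return one score per field: likes minus (optionally) dislikes.

    responses: item -> "L" (like), "I" (indifferent) or "D" (dislike)
    field_of:  item -> the single field to which the item is assigned
    """
    scores = defaultdict(int)
    for item, answer in responses.items():
        field = field_of[item]
        if answer == "L":
            scores[field] += 1
        elif answer == "D" and subtract_dislikes:
            scores[field] -= 1
        else:
            scores[field] += 0  # keep the field in the profile
    return dict(scores)


# Hypothetical items, each assigned to exactly one field.
field_of = {
    "repair a radio": "mechanical",
    "build a cabinet": "mechanical",
    "overhaul an engine": "mechanical",
    "sing in a choir": "musical",
    "play the piano": "musical",
}
responses = {
    "repair a radio": "L",
    "build a cabinet": "L",
    "overhaul an engine": "D",
    "sing in a choir": "D",
    "play the piano": "I",
}

print(profile(responses, field_of))
# {'mechanical': 1, 'musical': -1}
print(profile(responses, field_of, subtract_dislikes=False))
# {'mechanical': 2, 'musical': 0}
```

The two calls show that the decision to subtract dislikes changes the shape of the profile as well as its level.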

An ideal scale of interests will also yield separate scores for (a)
aspiration level, (b) degree of interest, and (c) the type of drive
behind the interest.

The aspiration level can be secured by having a scale of complexity
of tasks or occupations, which is based on required skill, intelligence,
or some other indication of difficulty. Several ratings of occupations
have resulted in scales of this type, and Lee and Thorpe (1943) added
aspiration scores to their interest scores by means of thirty items of
a forced-choice variety.

The degree of interest is assumed by most authors to correspond 
to the number of items checked in a field. This assumption can be 
shown to be psychologically inaccurate, for a person may be deeply
interested in a particular occupation, say watchmaker, but have
little or no interest in many of the activities listed in the mechanical
field. A separate score is needed to allow this pattern of interest to be 
well indicated. 

The kind of drive that is behind an interest is very important in 
employment or counseling work. Four drives are common: (a)
intrinsic satisfaction in the activity, (b) social satisfaction in activities
which one’s friends are doing or which they approve of, (c) vocational 
satisfaction in an activity which brings in money, and (d) escape 
from disagreeable surroundings or activities. Guilford, Shneidman, 
and Zimmerman (1948) included separate scores for hobbies and for 
vocational interests, which is a step in the right direction. 

Reliability of Scales and of Single Items 

Most of the reliability coefficients of scale scores are of the test- 
retest type, because of the lack of alternate forms and of the labor 
involved in making odd-even reliability studies. With groups of 
two hundred or more persons who have a wide variety of interests, 
the reliabilities of scores for groups of items are usually in the
neighborhood of .85, with slightly lower figures for high school groups.
These reliabilities are considered to be high enough for practical use,
and are only slightly below the reliability correlations for tests of
knowledge and reasoning skills.

Reliabilities for single items are indicated by the tendency for 
persons to change their answers to the item on subsequent occasions. 
Strong (1945) reported that 65 per cent of his items were answered 
identically by college freshmen after one year, 60 per cent by seniors 
after 5 years, and 58 per cent by the same seniors after 10 years. 
Preferences for school subjects, peculiarities of people, and
amusements showed the least change, and activities and occupations came
next. The most changeable were comparisons of occupations or
activities and self-ratings of present abilities. There was a slight
tendency toward greater stability for familiar items expressed in few
words, and also for “like” responses. Strong found that these usual
shifts did not change occupational scores significantly in most cases,
because some shifts increased the score, others decreased it, and some
had no effect.
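
A small made-up example may show how many single answers can change while the scale score stays the same. The five items, the answers, and the weights below are hypothetical and are not Strong's data. The sketch assumes that a scale score is the sum of the weights attached to the answers given.

```python
def percent_identical(first, second):
    """Per cent of items answered the same way on both occasions."""
    same = sum(1 for item in first if first[item] == second[item])
    return 100.0 * same / len(first)


def scale_score(answers, weights):
    """Sum of weights for the answers given; unkeyed answers count zero."""
    return sum(weights.get((item, a), 0) for item, a in answers.items())


# Hypothetical key: (item, answer) -> weight. Item 5 is not keyed.
weights = {(1, "L"): 2, (2, "L"): 1, (3, "D"): 1, (4, "L"): 1}

first = {1: "L", 2: "L", 3: "D", 4: "I", 5: "L"}
second = {1: "L", 2: "I", 3: "D", 4: "L", 5: "I"}

print(percent_identical(first, second))  # 40.0
print(scale_score(first, weights), scale_score(second, weights))  # 4 4
```

Here the change on item 2 lowers the score, the change on item 4 raises it, and the change on item 5 has no effect, so only 40 per cent of the answers are identical but the score is unchanged.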

Research is needed here to determine to what extent shifts or
stability of responses to items are due to complexity of form, to ambiguities
in interpretation, or to real changes in interests.

Intentional Misrepresentation

How much can an unsophisticated person intentionally misrepresent
his real interests on a standard interest questionnaire? Typical
of several studies is that of Cross (1948), who administered the Kuder
Preference Record to about six hundred high school seniors. The
highest scoring students, 181 boys and 183 girls, in each of the nine
scales were not told their scores but were later asked to fill out the
record again and to simulate a low interest in the field in which they
had scored highest, and a high interest in the field in which they had
scored lowest. Both sexes changed their scores significantly in every
one of the scales. The conclusion is that the record should be used
only when there is no reason to misrepresent oneself. Studies like this
have discouraged the use of interest scales in employment for specific
jobs, both civilian and military. Several studies of the Strong Blank
show that, when asked to do so, students and adults can change their
scores intentionally by great amounts.

A similar question arises: how much unintentional misrepresentation
is likely in a questionnaire of this kind? It seems highly probable
that likes which are thought to be socially unacceptable are
sometimes denied when they are included in a questionnaire. The
difficulty of accurately gauging one's own feelings is so great and the
human mind so complex that it is likely that we rate ourselves too high
or too low at many points. Several ways of detecting misrepresentation
in personality measures are discussed in Chapter XXII, but these
seem to have been applied only to the Kuder Preference Record.

Free Expression of Choice 

Two forms of items are commonly found. In one the subject is
asked to respond to a single activity or occupational title, while in
the other a choice must be made between two or more activities.
Although both forms seem to yield similar reliability coefficients, each
has apparent advantages and disadvantages.

The choice items are assumed to yield more accurate results
because it is alleged that a person can usually choose between activities
with more confidence and consistency than he can rate himself on
each one separately. More research is necessary to show in what
situations this assumption is true. The choice items are also a little more
economical of space, particularly when three or more items are
compared and six or more scoring combinations are obtained.

Choice items, however, often seem to force a decision between 
items for which one has the same degree of preference. A score may
thus be the sum of many inconsequential choices which were forced
by the form of the test, and not be a good indication of important
preferences. Choice items also impose arbitrary relationships among 
scores since by choosing one item, two other items are discarded. 
Such relationships are not imposed by the separate-response items. 
Thus, Kuder's scores, all based on choices between three activities,
show approximately 70 per cent of the correlations between scales
to be negative, while Thurstone's and Strong's scores, in which
choices are not important, show about 35 per cent negative
correlations. For instance, Kuder's correlation between musical and artistic
interests is -.20, while Thurstone's correlation for his scale of
musical and artistic interests is .47, and Strong's .57. This means that on
Kuder's Record a person with high artistic interests will usually have 
below average musical interests, and vice versa. On Strong's or 
Thurstone's Schedules the opposite is true. Choice items thus influ- 
ence the shape of the individual's profile. The choice item is prob- 
ably not as accurate as the single-response item. Considerable re- 
search is needed to determine the significance of the number and 
type of combinations presented. Thurstone's method of presenting
pairs of items, with instructions to indicate (a) a preference for one 
or (b) a preference for both or (c) a preference for neither, seems to 
have an advantage over either the choice item or the single-response 
item. 
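
The way a forced choice imposes negative relationships among scores can be shown with a deliberately simplified sketch. It assumes a single triad in which exactly one of three activities is marked, that each possible pattern occurs equally often, and that each activity is scored on a different scale. An actual record has many triads and a more elaborate response form, so the figures below only illustrate the direction of the effect.

```python
from itertools import product
from math import sqrt


def pearson(x, y):
    """Product-moment correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corr_first_two(patterns):
    """Correlation between the scores on the first two activities."""
    a = [p[0] for p in patterns]
    b = [p[1] for p in patterns]
    return pearson(a, b)


# Forced choice: exactly one of the three activities is marked.
forced = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
# Separate response: each activity is marked or not, independently.
free = list(product([0, 1], repeat=3))

r_forced = corr_first_two(forced)
r_free = corr_first_two(free)
print(round(r_forced, 2), round(r_free, 2))  # -0.5 0.0
```

Under these assumptions the choice form alone produces a correlation of -.5 between two scales that are unrelated when each activity is answered separately.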

Number of Items 

The schedule with the smallest number of items seems to be 
Thurstone's (1948) Interest Schedule, where only twenty paired com- 
parisons are used to establish each scale. Each answer is scored only 
once so that scales do not overlap with respect to basic items. Strong's 
scales, however, use several hundred answers for each scale, but no 
scale has exclusive use of an item. For instance, the scale for police- 
man has 670 weighted answers out of a possible 1,200 for the whole 
booklet. By scoring only the like responses about 61 per cent would 
be omitted, but the resulting scores correlated between .75 and .95, 
mean .85, with scores on the standard scales, and the differentiation
between occupations was slightly reduced in three fourths of the
comparisons. Strong believes that the slight decrease in scoring time
does not justify this loss of reliability and validity. Other shortcuts
in sampling or scoring were all reported as unsatisfactory by Strong.

A comparison of the Strong and Thurstone schedules shows,
however, about equal reliabilities, with Thurstone using only one sixth
the number of items. Reliability is achieved by lack of ambiguity
among items, and lack of variation among the subjects. A good deal
of research is needed to show which items are least ambiguous and
which show differences between groups most consistently. A
maximum of forty satisfactory, well-scaled items will yield reliabilities at
or above .90 on populations with a reasonable spread of scores, as has
been demonstrated many times.
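
The relation between the number of items and reliability is commonly estimated with the Spearman-Brown formula, which is not named in the passage above and is introduced here only as an illustration. The sketch assumes that items removed or added are of the same quality as the rest, and it uses .85 as an illustrative reliability for a long scale.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


r_long = 0.85                        # illustrative value for a long scale
r_short = spearman_brown(r_long, 1 / 6)
print(round(r_short, 2))             # 0.49
```

If a scale were cut to one sixth of its length with items of the same quality, the predicted reliability would fall to about .49. A short schedule that nevertheless matches a long one in reliability must therefore consist of items that are individually more consistent.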

Validity 

The validity of a scale is always a relative matter; it is valid or
invalid for predicting a particular situation or group of situations.
Strong believes his test has great validity because of three sorts of
evidence. First, it distinguishes quite clearly those who are
successfully engaged in one occupation from men in general. Illustration
187 shows a large difference between scores of successful engineers
and scores of Stanford students. Only 15 per cent of 933
nonengineering men rated A in engineering interest, and of this 15 per cent
many were in occupations which applied physical and mathematical
concepts. Second, validity is indicated by the degree to which
interest scores correspond to success in occupations. Among 181 life
insurance agents it was found that 67 per cent of those with A ratings wrote
at least $150,000 worth of policies a year, and less than 6 per cent of
men with C ratings achieved this result. A third evidence of validity
comes from records showing that men who continue in an
occupation obtain high interest scores in it.

Estes and Horn (1938) questioned the validity when they found 
remarkable differences in engineering interest scores among groups
of engineering students in five curricula. The students who were
studying the mechanical-engineering curriculum had a median score
which was approximately the 99th centile in Strong's engineering
group, while the students in civil engineering and in chemical
engineering had scores similar to those of the lowest three or four per
cent of Strong's engineers.

In order to evaluate the interests of these five groups, five new
scales were constructed following Strong's technique. Then it was
found that when students in one curriculum were scored on the
scale of another curriculum 61 per cent were rated either B- or C.
The authors also calculated the correlations between the Estes-Horn
Scales and the Strong Engineering Scale. The scales for electrical
and mechanical curricula groups correlated approximately .70 with
Strong's Engineering Scale. The correlation with the
chemical-engineering interest scale was -.27; with the industrial, -.63; and with
the civil, -.09. Since chemical and industrial engineers had not
been included in Strong's criterion group, there is some justification
for the negative correlations. Strong did, however, include civil
engineers in his group, hence the zero correlation is difficult to
explain. These results probably indicate that interest patterns vary so
much among engineering students that a single scale will not
suffice to measure their interests accurately. Only the electrical- and
mechanical-engineering students showed patterns of interest similar 
to Strong's sample of engineers of approximately forty-three years of 
age. 

Kuder also feels that his interest scores are valid indicators of the 
relative strength of broad interest areas, as shown by the profiles of
occupational groups, the profiles of women in training for specific
occupations, and the significant differences between clerical workers
who were satisfied and those who were dissatisfied with their work.

In order to maintain and improve these predictions continual 
research is needed along two lines. One is the refinement of the 
scales discussed above, and the other is the comparison of interest 
scores with later vocational success among carefully selected groups. 

The building of better norms also depends upon a more analytical 
approach to the criteria of job satisfaction. Many persons like some 
aspects of their work, tolerate others, and dislike still others. Hold- 
ing a job for a few years or even achieving a marked degree of suc- 
cess on a job does not indicate intrinsic satisfaction with the work 
activities. Questionnaires which show reliably the aspects of the work
that give one greatest satisfaction would probably yield better criteria
of vocational interest than those now generally used. To be most
useful, such job-satisfaction questionnaires would have to have divisions
comparable to the independent areas evaluated by the interest ques- 
tionnaire. 


STUDY GUIDE QUESTIONS 

1. Distinguish between motive, drive, and incentive.

2. How may primary and secondary interests be defined and measured?

3. How can the development of interests in an individual be traced?
What light do such studies throw on the nature of interests?

4. What value are logs or diaries in the determination of the strength
and direction of interests?

5. Why do most of the interest inventories try to eliminate the
considerations of ability or interest?

6. How was the Strong Vocational Interest Blank put together? What
principal topics does it cover?

7. Compare the scoring methods of Strong, Kuder, and Thurstone.

8. Compare the types of items used by Strong, Kuder, and Thurstone.

9. What use are level-of-interest scores? How can they best be obtained?

10. To what extent do interest questionnaires lend themselves to
intentional misrepresentation? What methods might yield less error of this kind?

11. What typical age and sex differences have been found?

12. What correspondence between interests and academic success has
been found?

13. What evidence is there of interest as an important factor in
occupational success?

14. How may ambiguity of items be reduced?

15. What are the advantages and disadvantages of using names of
occupations as items in an interest questionnaire?

16. How well do the various interest questionnaires correlate with each
other? What reasons can be given to explain the results?

17. What effect may the forced-choice form of item have on the final
scores?



CHAPTER XXI 


APPRAISALS OF 
ATTITUDES 




INTRODUCTION 

There does not appear to be a clear-cut distinction between likes and 
dislikes, discussed in the previous chapter, and attitudes, discussed 
here, but a rough practical discrimination can be made on the basis 
of subject matter and ethics. The subject matter of a preference in- 
ventory is limited to personal activities, whereas attitudes usually 
involve broader questions of policy or values for a group. In in- 
dicating his personal preferences, one is usually asked to state his 
satisfactions regardless of their moral significance. A person may like 
an activity which he believes is immoral. In attitude scales, however, 
a person is often required to voice his approval or disapproval of in- 
stitutions, activities, races, or principles. His personal activities are 
not emphasized as much as his opinion concerning what should be 
done for the good of all. 

A simple type of unscaled appraisal of attitude is a single vote or
judgment for or against a given act. Voting activities range from 
selecting a beauty queen to approving a far-reaching government
reform. Voting is often considered to be a superficial method of
evaluating attitudes, since many persons cast votes without understanding
a proposal. Often those who are least informed vote with most
assurance; then, a few minutes later, after a dramatic appeal, they are
ready to change their votes. When persons are adequately informed 
and uninhibited, however, voting is an excellent means of attitude 
appraisal. Practically, voting is designed to decide a present issue,
and rating scales are designed to evaluate usual goals or preferences.
However, there seem to be no important psychological differences 
between voting and the rating scales discussed below. The scales rep- 
resent attempts to refine voting. 

McNemar (1946) in a thorough review of opinion-attitude meth- 
odology indicated that many opinion gaugers were content with low 
degrees of reliability and that verbal expression of an attitude should
be carefully checked for validity with other aspects of behavior.
Gallup (1944), Cantril (1944), and Link (1943) have published
substantial reports on polling methods and results, and Blankenship (1943)
has published reports on consumer opinion. 

A great deal of activity is now going on in public-opinion research, 
because samples of public opinion have been found to be of great 
value for sales organizations, managers of political parties, govern- 
ment agencies, social scientists, and policy makers in general. Public- 
opinion polls have become a big business and will probably play an 
important role in national and international affairs for some time. 
They usually employ short series of unscaled items which are made up
for a particular group and time. However, some opinion research be- 
gins with a “depth” interview. 

Link (1943) and several other workers, using projective techniques, 
have experimented with photographs or pictures. Cantril (1944) 
points out that opinion has four important “dimensions”: direction,
intensity, breadth, and depth. Nearly all polls report only the di- 
rection, but the intensity of emotional involvement, the breadth or 
inclusion of many related details, and the depth or foundations of 
tradition and personality are important determiners of the course of 
opinion and action. 

Cantril (1944) has reported in detail on the problem of devising 
questions which will give the most accurate results in public-opinion 
polls. He found that general questions concerning the acceptance of 
or deviation from well-established policies or ideals were less likely to 
show true individual opinion than questions which presented spe- 
cific issues having a personal context. 

With regard to form, Cantril preferred a leading statement with 
three or four choices which do not irritate or becloud the alternatives. 
The 2-choice question has simplicity which makes it most useful when 
it will not force answers into a poor representation. Free responses 
were recommended on small samples, in order to discover what
people think are the issues and to secure meaningful alternatives.

TYPICAL SCALES

Among the typical scales in this field are Watson's (1925) Test of 
Public Opinion, Thurstone and Chave's (1929) Scales of Attitudes,
Allport and Vernon's (1931) Study of Values, and Murphy and
Likert's (1938) Study of Public Opinion. Two scales which illustrate
interesting techniques are Horowitz's (1936) measurement of attitude
toward the Negro, and Brandt's (1937) analysis of replies to questions
of personal ethics. These scales will first be described, and then typical
results will be quoted.

Watson’s Test of Public Opinion 

Watson's Test of Public Opinion consists of six parts of
approximately 50 items each. Each part uses a different method of eliciting
opinions. The first part is a cross-out test of 51 items in which one
is instructed to cross out every word which is more disagreeable or
annoying than agreeable or attractive. The second part is called a
degree-of-truth test. In this a person is asked to rate 53 items on a
5-step scale, from utterly true to utterly false. The first five items are
shown in Illus. 198. The third part is called an inference test. Here
a paragraph is followed by seven or eight short statements. The
student is asked to consider whether the conclusions follow from the
paragraph, as shown in Illus. 199. Prejudice is supposed to be
indicated when persons check items which do not logically follow from
the paragraph. The fourth part, called a moral-judgment test, asks
a student to indicate approval or disapproval of instances which are
described in short paragraphs (Illus. 200). The fifth part is called an
arguments test. Here one is asked to distinguish between strong and
weak arguments in support of a particular question. The last part is
a generalization test which is similar to the second part. Watson also
secured information concerning one's approximate wealth,
schooling, vocational and religious experience, and membership in political
parties and clubs. The method of scoring is to obtain the sum of all
points of credit allotted for statements favoring one point of view. A
general picture of a person's prejudice may be had from a gross score,
and the separate scale scores give a profile showing the strength of
opinion along the twelve lines of economic, religious, and moral
values shown in Illus. 201.

The scoring standards were secured by submitting the items to
small groups of judges including teachers, industrial leaders, and
psychologists. No items were retained in a preliminary form which
were not agreed upon by at least 73 per cent of the judges. In the
final form, items were excluded unless they showed some plausibility

Directions: No one knows just what the American people are thinking. There is
need to find out just what convictions are most firmly held on some disputed issues.

Indicate your opinion about each of the statements on the following pages by
drawing a circle around the one of the numbers in the margin which expresses your
judgment. The meaning of each number is as follows:

Mark:

+2  If you feel the statement is utterly and unqualifiedly true, so that no
    one who had a fairly good understanding of the subject could sincerely
    and honestly believe it false.

+1  If you feel that it is probably true or true in large degree.

 0  If you feel that it is quite undecided, an open question or one upon
    which you are not ready to express an opinion.

-1  If you feel that it is probably false or false in large degree.

-2  If you feel that the statement is utterly and unqualifiedly false, so
    that no one who had a fairly good understanding of the subject could
    sincerely and honestly believe it true.

Work rapidly, but do not fail to circle one figure in each line.

1.  +2  +1  0  -1  -2   The churches are more sympathetic with capital
                        than with labor.

2.  +2  +1  0  -1  -2   Dancing is harmful to the morals of young
                        people.

3.  +2  +1  0  -1  -2   Jesus was more interested in individual salvation
                        than in social reconstruction.

4.  +2  +1  0  -1  -2   To have experienced business men, who have
                        made a financial success in private enterprise,
                        hold the public offices of the country would be
                        better than the present personnel.

5.  +2  +1  0  -1  -2   The modern laxness in the observation of Sunday
                        is, on the whole, harmful to the best interests
                        of the people.

6.  +2  +1  0  -1  -2   Foreigners who work in our mines or factories
                        should be paid on the basis of the same standard
                        of living which we would set for American homes.

(Watson, 1925. By permission of the Bureau of Publications, Teachers College,
Columbia University.)


by being chosen by several persons among a sample of two hundred.
The odd-even reliability was approximately .96 for total scores. The
separate parts showed a median odd-even reliability of .80, and a
median correlation with the total scores of .(53. Part II, Degree of
Truth, showed the highest correlation, .94, with the total, and also
with total scores of the remaining parts, .42. The first part, Cross-Out
Test, showed low correlations with the other parts, .11. Watson con-
Directions: Mere facts may mean different things to different people. It is often
important to know just what people think certain facts mean. In the following
pages you will find several statements of fact and, after each, some conclusions
which some people would draw from them.

Put a check (√) in front of each conclusion that you believe is fairly based upon
the fact as given here. Do not assume anything else than the evidence given in the
statement here, with all its terms understood. You are not to consider whether
the conclusions are right or true in themselves, but only whether they are rightly
inferred from the facts given in the statement. You may check as many as you
believe to be perfectly sure and certain. Do not check any merely probable
inferences.

Example: 6,500 students recently attended a conference in which the questions
of race relations and of possible attitudes toward war were discussed, these being
the problems the students felt to be most vital today.

     The students were all pacifists.
     The students were all militarists.
     The students came from all sections of the country.
     Some students are interested in the treatment of Negroes and Japanese in
     this country.
     Some students felt war was wrong.
  √  The question of attitudes toward war is considered by many students to be
     important enough to be discussed.

I. Statistics show that in the United States, of 100 men starting out at an age of
25, at the end of 40 years one will be wealthy and 54 will be dependent upon
relatives or charity for support.

  1. The present social order cheats the many for the benefit of the few.
  2. The average young man, under present conditions, cannot count on being
     wealthy at the age of 65.
  3. Most men are shiftless, lazy, or extravagant, otherwise they would not
     need to be dependent.
  4. The one man is living upon luxuries ground out of the bones of the mass of
     common people.
  5. Some day the workers will rise in revolt.
  6. None of these conclusions can fairly be drawn.

(Watson, 1925. By permission of the Bureau of Publications, Teachers College,
Columbia University.)

cluded that each part of the test was reliable enough to use in attitude
measurement and valuable because of its special contribution.
Correlations of intelligence and reading test scores with total
prejudice were nearly zero. Part III, Inference, and Part IV, Moral
Judgment, also show zero correlations with two group intelligence tests.
Watson believed that the incorrect inferences were usually made
not from failure to reason correctly among these college students,
but from strong convictions which overcame intellectual conclusions.
Illustration 201 shows the results of applying the Test of Public
ILLUS. 200. MORAL JUDGMENT TEST

Directions: Most actual judgments of right and wrong have to be made in
concrete instances. Mere general principles are not enough.
In the following pages you will find several instances upon which the moral
judgment of individuals would differ. Read each carefully. You may assume
each fact as stated. Then look at the alternatives suggested below it. Place a
check (√) in front of the one with which you most fully agree. If you do not fully
agree with any, check the one which comes nearest to expressing your opinion about
the incident.

Example: A man stumbled into his house, drunk with bootleg whiskey. He
smashed up some of the furniture and beat his wife and children. Then he stole
some money from his small son's bank in order to buy more whiskey.

     His action is worthy of approval.
  √  The people who tolerated the sale of bootleg whiskey were in some degree
     responsible.
     The occurrence is worthy neither of approval nor of disapproval. It is quite
     indifferent.
     It would be desirable to prevent such a thing happening again, if possible, by
     establishing a better type of character in the man himself.

I. In 1793 the government of the United States recognized the young French
republic, and President Washington received Genêt, the French ambassador. At
the time Paris ran red with blood, the jails were full of the nobles who had been
driven from power, and the government was in the control of a few high-handed
dictators.

  1. Washington was right in recognizing this government.
  2. It made no difference whether recognition was extended or not.
  3. Washington was unwise. The government should not have been recognized
     under such circumstances.
  4. If it was a government which the majority of the French people really
     wanted, then it should have been recognized.

(Watson, 1925. By permission of the Bureau of Publications, Teachers College,
Columbia University.)

Opinion to two groups of normal school students, one in New Jersey
and the other in Wisconsin. In general the scores are similar,
although the New Jersey students show consistently more preference
for economic conservatism and strict moral standards than the
midwestern group. In other reports marked differences have also been
found among groups of persons who would be expected to show
marked religious or economic preferences. Watson's Test of Public
Opinion has in the past been used to appraise the effects of courses
of instruction, but some of its items are now out of date.
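The odd-even reliabilities reported above for Watson's test rest on a simple computation, which may be sketched as follows. Each person's answers are split into odd-numbered and even-numbered items, the two half-scores are summed, and the halves are correlated over persons. The function names and the sample data are illustrative assumptions. The text does not say whether the Spearman-Brown correction was applied to Watson's coefficients, so it is shown only as an optional step.

```python
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def odd_even_reliability(item_scores):
    """item_scores holds one list of item scores per person.

    Items 1, 3, 5, ... form one half and items 2, 4, 6, ... the other.
    The result is the correlation between the two half-test totals.
    """
    odd_half = [sum(person[0::2]) for person in item_scores]
    even_half = [sum(person[1::2]) for person in item_scores]
    return pearson_r(odd_half, even_half)


def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation.

    Optional step; whether it was used for the coefficients quoted in
    the text is not stated there.
    """
    return 2 * r_half / (1 + r_half)


# Hypothetical scores for four persons on six items.
sample = [
    [2, 1, 2, 2, 1, 2],
    [0, 0, 1, 0, 0, 1],
    [1, 1, 1, 2, 1, 1],
    [2, 2, 2, 1, 2, 2],
]
print(round(odd_even_reliability(sample), 2))
```

With real data the half-test correlation would ordinarily be computed on a much larger group than the four persons used here.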

Thurstone and Chave’s Attitudes Scales 

Another type of attitude scale was constructed by Thurstone and
Chave (1929), who designed a single scale for attitudes toward an in-
Prejudice in Agreement With:

Key

■ Indicates more prejudice on part of students in a Wisconsin
normal school.

_ Indicates more prejudice on part of students in a New
Jersey normal school.

(Watson, 1925. By permission of the Bureau of Publications, Teachers
College, Columbia University.)

stitution, such as the church. They first procured approximately three
hundred statements which showed a wide range of approval or
disapproval of the institution in question. The method of selecting and
scaling the items is discussed in Chapter XVI. Illustration 17 shows
the odd-numbered half of the items in their scale for measuring
attitudes toward the church. Similar scales have been made up for
evaluating attitudes toward a fairly large number of institutions and
racial groups. The advantages of this type of scale are: (a) it insures a
wide range of opinion among items, (b) the items have been selected to
form nearly equivalent steps in a scale, and (c) items which are
irrelevant or ambiguous have been excluded. A person's score is simply
the median scale value of the items which he checks.
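The scoring rule just stated, the median scale value of the items checked, can be illustrated with a short sketch. The scale values below are invented for the example and are not taken from the Thurstone and Chave scale. The function name is likewise an assumption.

```python
from statistics import median


def thurstone_score(scale_values, checked_items):
    """Return the median scale value of the items a person checked.

    scale_values maps item number to its scale value.
    checked_items lists the item numbers the person endorsed.
    """
    if not checked_items:
        raise ValueError("no items were checked, so no score can be given")
    return median(scale_values[item] for item in checked_items)


# Hypothetical scale values for five statements.
values = {1: 0.8, 2: 2.3, 3: 5.1, 4: 7.4, 5: 9.6}

print(thurstone_score(values, [2, 3, 4]))  # middle value of 2.3, 5.1, 7.4
```

When an even number of items is checked, the median falls midway between the two middle scale values.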


Murphy and Likert’s Study of Public Opinion 

The work of these investigators was carried on over a period of 
five years among college students in several universities. They sought 
to determine what attitudes were held by these groups and what
experiences had determined these attitudes. They used three sorts of
appraisals of attitudes: self-rating inventories, reactions to pictures
and paragraphs, and autobiographies. They also secured measures of
amount of specific information about controversial issues and records 
of scholastic success. 

The self-rating inventories called attitude scales were constructed 
to evaluate opinions on: 
1. Internationalism: war and peace.

2. Imperialism: the use of force by great powers to maintain or extend
their empires.

3. Negro activities: social, economic, political, and educational.

Some items for these scales were selected from questionnaires which
had been extensively used; others were designed to sample opinions
in the three fields listed above. In the selection of items the following
five criteria were to be applied:

1. All statements were to be an expression of desired achievements,
not merely statements of fact, since persons with opposing attitudes
often agree on questions of fact.

2. Clear statements involving only one issue were to be used.

3. Each statement was to be worded so that the modal reaction to
it would be near the middle of possible responses.

4. Approval of about one half the statements and disapproval of
the other half would express the same attitude.

5. All the items in one scale were to measure the same attitude. This
was to be achieved in part by requiring each item to show a high
correlation with the total score, median .60.

Total scores were the sums of values assigned to various degrees of
approval or disapproval of each statement. The retest reliabilities
of the three scales when applied twice, after a 3-week interval, were
found to be approximately .85, using small samples of college
students.
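The summing of values described above may be sketched as follows. The five-point response range (1 to 5) and the reversal rule are assumptions made for illustration, since the text does not give the numerical values Murphy and Likert assigned. The reversal reflects criterion 4, under which approval of some statements and disapproval of others express the same attitude.

```python
def likert_total(responses, reversed_items, points=5):
    """Sum item values after reversing the negatively keyed items.

    responses      -- one answer per item, each from 1 to `points`
    reversed_items -- set of zero-based item positions whose approval
                      expresses the opposite attitude
    points         -- number of response steps (assumed to be 5 here)
    """
    total = 0
    for position, answer in enumerate(responses):
        if not 1 <= answer <= points:
            raise ValueError(f"answer {answer} is outside 1..{points}")
        if position in reversed_items:
            answer = points + 1 - answer  # 5 becomes 1, 4 becomes 2, ...
        total += answer
    return total


# Hypothetical four-item scale in which items 2 and 4 are reversed.
print(likert_total([5, 1, 4, 2], reversed_items={1, 3}))  # 5 + 5 + 4 + 4 = 18
```

A person who answers at the midpoint throughout receives the same total whether or not any items are reversed.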

The median correlations between the scales were: internationalism
with imperialism, .63; internationalism with Negro activities,
.40; and imperialism with Negro activities, .31. A fourth scale on
economic liberalism, which was not so widely used, dealt with
organized labor and the distribution of wealth. Scores on this scale
correlated .40 with internationalism, .39 with imperialism, and .30 with
attitudes toward the activities of Negroes. The authors believe that
their scales demonstrate "high generality rather than specificity in
social attitudes." They also feel compelled to admit the presence of
a general radicalism-conservatism factor as an explanation of the
intercorrelations among these scales and data from autobiographies.

Films, photographs, and paragraphs depicting scenes of violence
or force were also extensively used. Illustration 202 gives the
questions that were asked after a picture of the wrecked automobile of a
worker who refused to go out on strike had been shown. In addition
to similar pictures, three short motion picture films were used and a
number of paragraphs. The results indicated that attitudes toward
specific events could be measured reliably, but low correlations were
found between items. This fact led to the conclusion that specific fac-
ILLUS. 202. APPRAISAL OF ATTITUDES BY PHOTOGRAPH

(Questions asked after viewing photograph No. 14, showing a wrecked automobile
of a workman named Langlois who refused to go out on strike)

1. Describe briefly in outline form your reaction to this photograph.

2. Indicate, by checking, how this picture affects you. Double check those which
are especially intense.

     Excites          Angers (irritates or enrages)
     Depresses        Amuses
     Thrills          Disgusts
     Bores            Others
     Interests

3. In this situation, with whom do you sympathize?

4. What do you like or dislike in this photograph?

5. Why? (Answer briefly.)

(Total response to this photograph scored 2.)

(Murphy and Likert, 1938. By permission of Harper and Bros.)

tors in each situation were large as compared with a general factor.
Similar results were found for single items in the attitude scales.

The Bogardus (1925) Test of Social Distance, shown in Illus. 203,
was also applied by Murphy and Likert. It yielded specific scores of
attitudes toward particular races and a general tolerance score whose
odd-even reliability was found to be .88 among small groups of
college students. This high coefficient was taken to indicate a high
degree of generalization of attitudes toward persons in national and
racial groups who are regarded as outsiders. The total tolerance score
correlated approximately .68 with the separate scores from the
questionnaires on internationalism, attitudes toward the Negro, and
economic liberalism.

All of these opinion scores show only zero relationships with
measures of specific information in the same fields or with intelligence
test scores. However, a positive relationship between grades in college
and radicalism scores on the opinion scales was discovered.
Scholarship was more highly related to a radical stand for peace than to
the other attitude scores. The authors believe that high scholarship
usually involves much reading of recent literature, which happens to be
largely radical in nature. The only other factor which corresponded
with attitude scores was attitudes of parents. The authors conclude
that the appraisals of attitudes dealing with public issues can best be
studied from biographical material supplemented by systematic
inventories.

Attitude toward the Negro

Horowitz (1936) investigated attitude toward Negroes by using
three kinds of tests which were administered to boys from the kinder-
ILLUS. 203. BOGARDUS' TEST OF SOCIAL DISTANCE, REVISED

(Murphy and Likert, 1938, p. 133. By permission of Harper and Bros.)

garten through the eighth grade in various communities in or near
New York City and in Georgia and Tennessee. Two of the tests used
a page of 12 pictures of boys' faces, 4 of white and 8 of Negro, all
judged to be pleasant by a group of adults (Illus. 204). On one test,


ILLUS. 204. PHOTOGRAPHS OF WHITE AND NEGRO BOYS

(Courtesy of Horowitz, 1936. By permission of the Archives of
Psychology.)


called “Ranks,” the instructions were to “Pick out the one you like 
best, next best, next best, and so on, until all are ranked.” The score 
was the sum of the ranks assigned to the white boys' pictures; the 
smaller the score, the greater the preference for whites. On another 
test called “Show Me,” the instructions were: 

1. Show me all those that you want to sit next to you on a street car. 

2. Show me all those that you want to be in your class at school. 

3. Show me all those that you would play ball with. 

4. Show me all those that you want to come to your party. 

5. Show me all those that you want to be in your gang.

6. Show me all those that you want to come home with you for lunch.

7. Show me all those that you want to sit next to in the movies.

8. Show me all those that you would go swimming with.

9. Show me all those that you'd like to have for a cousin.

10. Show me all those that you want to be captain of the ball team.

11. Show me all those that you want to live next door to you.

12. Show me all those that you like.

The score was the per cent of total selections which were white
boys. The higher the per cent, the greater was the preference for
whites.
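The two scores described for the Ranks and Show Me tests can be sketched briefly. The picture labels and function names are invented for the example. The chance values follow from the arithmetic of 4 white pictures among 12, on the assumption that pictures are picked without regard to race: the expected sum of four ranks is 4 x 6.5 = 26, and the expected share of white selections is 4/12, or about 33 per cent.

```python
def ranks_score(ranking, white_ids):
    """Sum of the rank positions given to the white boys' pictures.

    ranking lists picture labels from best liked (rank 1) to least liked.
    A smaller sum means a greater preference for whites.
    """
    return sum(
        position
        for position, picture in enumerate(ranking, start=1)
        if picture in white_ids
    )


def chance_ranks_score(n_pictures, n_white):
    """Expected sum of ranks if race played no part in the ranking."""
    return n_white * (n_pictures + 1) / 2


def show_me_score(selections, white_ids):
    """Per cent of all selections, over all questions, that were white boys."""
    chosen = [picture for answer in selections for picture in answer]
    if not chosen:
        return None  # nothing selected, so no percentage can be formed
    white_count = sum(picture in white_ids for picture in chosen)
    return 100 * white_count / len(chosen)


# Hypothetical labels: w1..w4 are white boys, n1..n8 are Negro boys.
white = {"w1", "w2", "w3", "w4"}
order = ["w1", "w2", "w3", "w4"] + [f"n{i}" for i in range(1, 9)]

print(ranks_score(order, white))      # 1 + 2 + 3 + 4 = 10
print(chance_ranks_score(12, 4))      # 26.0
print(show_me_score([["w1", "n1"], ["w2", "w3"]], white))  # 75.0
```

Scores on the Ranks test below 26, or on the Show Me test above about 33 per cent, are therefore the ones that exceed chance in the direction of preference for whites.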

A third test consisted of thirty-nine photographs showing activities
of playing marbles, choosing sides for baseball, hand wrestling, sitting
outdoors, playing a piano, listening to a radio, playing checkers,
eating in an ice cream parlor, eating dinner in a home, and common
situations in lavatory, workshop, museum, library, and schoolroom.
Each situation was posed twice, once by four white boys and again
with a Negro boy substituted for one of the whites. Each picture was
observed in turn and the examiner asked, "Do you want to join in
with them and do what they're doing along with them?" The scores
were the difference between the number of times all-white groups and
groups with one colored boy were chosen. The results of each test
proved to be fairly reliable when checked for retest consistency. The
general trends for New York City whites are shown in Illus. 205. Here
the ranking test showed the most prejudice, and this was true for
all grades. The Show Me Test indicated little prejudice at the
fifth year, but a rapid increase until approximately the eighth
year. The Social Situations Test showed little prejudice at the
fifth year, and a slow increase to the fourteenth year.
Intercorrelations between these tests increased with age. The New York
City group showed responses similar to those of the Southern
groups. The degree of prejudice, as shown by a tendency to exceed
chance scores, was greater in the less specific test situations. Horowitz
concludes that prejudice is a socially developed response which is
derived from various sources. The most important source, at first, is
probably the attitudes of friends and relatives. Later, the attitude may
be the result of specific experiences.


ILLUS. 205. GROWTH OF ATTITUDES TOWARD THE NEGRO

Degree of Prejudice

(Horowitz, 1936, p. 2‘). By permission of the Editor, Archives of
Psychology.)
Reasons for Acts

Direct approaches are inadequate when the subjects do not wish to
reveal their true attitudes. Brandt (1937) made an approach to the
analysis of replies to questions of ethical significance which is
interesting because it is somewhat indirect. In a group test situation the
following instructions were read:

In each of the situations given below, a person would have to make a
choice between two possible actions. For each situation write down all of
the points he should consider before he decides which to do. Remember we
are not in the least bit interested in which decision you think is the right
one, or which one you would make. You are to concern yourself with
writing down only the things that are to be considered in making the decision.

At this point the experimenter gave orally an example (losing one's
fountain pen; to buy a new one or not). He then discussed with the
group the particular things involved in a consideration of the
situation presented. Samples of the situations used are:

1. Property:

a. Stealing: finding an unlocked car on the street; to drive it around
awhile and return it or leave it alone.

b. Destruction: finding some things one wants in a magazine in the public
library; to tear out some pages or not.

2. Persons:

a. Friends: finding out about someone who has been doing serious injury
to other people; to report him or not.

b. Group loyalty: under what circumstances would one join a club?

3. Authority:

a. Parent: trying to get along with the family or leaving home.

b. Law: obeying a law or school ruling even though you don't see the
reason for doing it.

c. Religion: going to Sunday movies or not.

4. Social:

a. Conventions: having to do something that injures your physical health
in order to be one of a group.

Indulging in one's own tastes for clothes regardless of what others are
wearing.

b. Superstitions: finding a ladder in one's path; to go under it or avoid it.

The answers were classified into six groups on the basis of type of
goal. These are illustrated in Illus. 206 for the item, "Indulging in
one's own tastes for clothes regardless of what others are wearing."
The boundaries of these classes are doubtless somewhat vague, but
it seems probable that in a more extended and refined study a high
degree of uniformity would be found among judges in classifying the
answers.
ILLUS. 206. BASES OF JUDGMENTS

For the item "Indulge in one's tastes for clothes regardless of what others are
wearing":

1. Self-regard (17 per cent)

Will I be looked down upon?

Will people laugh at you?

2. Parental approval (8 per cent)

Do my parents think them OK?

3. Friends' approval (7 per cent)

What do my friends think?

4. General welfare (8 per cent)

Wear the best you can.
Clothes do not make the man.

5. Objective or practical considerations (30 per cent)

Does the clothing fit? Is it appropriate for the weather? Where are you living?
What is the comparative cost and durability?

6. Social institutions (21 per cent)

Who are the people you associate with? Have I social ambitions? Do you like
to dress like other people?

(After Brandt, 1937. By permission of the University of Iowa,
Studies in Child Welfare.)

The per cents in the parentheses in Illus. 206 are the distribution
of all answers given by all persons to all the items. Brandt found that
individuals varied a great deal from this average distribution of
answers. Some subjects emphasized their self-regard, others parental
approval, and others social orientation. This is a valuable diagnostic
technique which should be more widely used. It is interesting to note
that moral or religious considerations were nearly lacking in this
group. The answers would doubtless vary if the study were conducted
in a Sunday school.

Responses to particular questions showed that there was a good
deal of hedging or evasion or making of particular exceptions for
unethical acts. Thus, most of the group were willing to take an
unlocked car for a spin on such considerations as: "Is there sufficient gas
in the tank? Can they drive? Can they get away with it? What are the
chances of wrecking the car? Is there an urgent need? Who is the
owner? Will he care?" Considerations of this sort were most frequent
on items which seemed to be or to have been of immediate concern to
the subjects.

Allport and Vernon’s Study of Values 

The Study of Values Questionnaire is interesting in that it attempts
to appraise attitudes in the six fields which were described by
Spranger (1928). The test includes 120 items, distributed equally
between the six fields. An inspection of the test shows the following sorts
of items classified as:

1. Theoretical: discovery of natural laws, mathematical relations, and
scientific facts.

2. Economic: activity in real estate, finance, industrial development, and
vocational training, and practical applications in general.

3. Aesthetic: indulging in artistic appreciation or composition in poetry,
literature, music, dancing, architecture, and scenery.

4. Social: responding to others' needs with unselfishness and sympathy as
shown by charity, freeing of slaves, hospital and social service work, and
general friendliness and co-operation.

5. Political: managing governmental and legal affairs, curtailing charity,
aggression in war, debating, playing on athletic teams, organizing,
holding a seat in Congress, exploration of little known parts of the
world, and acquiring professional and social prestige.

6. Religious: abolishing war, laying up treasures in heaven, being reverent
in a church, believing in God, comparing religious faiths, and
evaluating life as a whole.

The test has two parts. All the items in Part I require that a
preference be expressed between two fields of activity (Illus. 207). These
items combine judgments of Yes or No with ratings from 0 to 3, to
show amounts of preference. The first item demands a judgment
between discovery of scientific laws and applications of scientific laws;
the second item, between aesthetic creation and social compatibility.
The items in Part II require that four choices be rated in order of
their appeal (Illus. 207). In item 1 the first choice emphasizes social
values; the second, economic; the third, religious; and the fourth,
political. Parts I and II are combined into a final score for each
field, which may be expressed in profiles, such as those in Illus. 208.
Here the mean scores of a group of sixty-one engineering students
show that their greatest interests lay in theoretical and economic
activities, and their least in the aesthetic and religious. The highest
interests for a group of eighty-one missionaries of both sexes are
shown by this figure to be in religious activities, the next highest in
social, and the lowest in political and economic activities. A fairly
large number of studies using this questionnaire have reported
differences of the same kind between various professional and social
groups.
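The tallying of Part I answers may be sketched as follows. Each item sets one field against another, and the directions in Illus. 207 allow only four pairs of figures: 3 and 0, 0 and 3, 2 and 1, or 1 and 2. The pairing of fields with the two sample items follows the description above, but the function name and data layout are assumptions. The text does not describe how the Part II ranks are turned into points, so that step is left out.

```python
# The only figure pairs the Part I directions permit.
ALLOWED_PAIRS = {(3, 0), (0, 3), (2, 1), (1, 2)}


def tally_part_one(item_fields, answers):
    """Add the figures written for each item to the fields it compares.

    item_fields -- one (field_for_a, field_for_b) pair per item
    answers     -- one (figure_for_a, figure_for_b) pair per item
    Returns a dictionary of field totals.
    """
    totals = {}
    for (field_a, field_b), (fig_a, fig_b) in zip(item_fields, answers):
        if (fig_a, fig_b) not in ALLOWED_PAIRS:
            raise ValueError(f"combination {fig_a}, {fig_b} is not permitted")
        totals[field_a] = totals.get(field_a, 0) + fig_a
        totals[field_b] = totals.get(field_b, 0) + fig_b
    return totals


# The two sample items: theoretical against economic, aesthetic against social.
fields = [("theoretical", "economic"), ("aesthetic", "social")]
print(tally_part_one(fields, [(3, 0), (1, 2)]))
```

Because every permitted pair adds to 3, a person's Part I totals over all fields always sum to three times the number of items answered, so a high score in one field must be offset by lower scores elsewhere.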

A statistical evaluation of this form has been reported by Cantril
and Allport (1933), who reviewed fourteen published studies of this
questionnaire. The original means were found to be correct for an
additional group of 2,755 subjects. The range of scores varies from
a standard deviation of 5.5 in social values to 9.7 in economic. The
ILLUS. 207. A STUDY OF VALUES

PART I

Directions: A number of controversial statements or questions with two
alternative answers are given below. Indicate your personal preferences by writing the
appropriate figures in the right-hand columns, as indicated:

                                                               (a)   (b)
If you agree with alternative (a) and disagree with (b),
write 3 in the first column and 0 in the second column, thus    3     0
If you agree with (b); disagree with (a), write                 0     3
If you have a slight preference for (a) over (b), write         2     1
If you have a slight preference for (b) over (a), write         1     2

Do not write any other combination of figures after any question except one of
these four.

There is no time limit, but do not linger long over any one question or statement,
and do not leave out any of the questions, unless you find it really impossible to
make a decision.

                                                               (a)   (b)
1. The main object of scientific research should be
the discovery of pure truth rather than its practical
applications. (a) Yes; (b) No.

2. Do you think that it is justifiable for the greatest
artists, such as Beethoven, Wagner, Byron, etc.,
to be selfish and negligent of the feelings of others?
(a) Yes; (b) No.

PART II

Directions: Each of the following situations or questions is followed by four possible
attitudes or answers. Arrange these answers in the order of your personal
preference from first to fourth by writing, in the left-hand margin,

1 beside the answer that appeals to you most,
2 beside the answer which is next most important to you,
3 beside the next, and
4 beside the answer that least represents your interest or preference.

You may think of answers which would be preferable from your point of view
to any of those listed. It is necessary, however, that you make your selection from
the alternatives presented, and arrange all four in order of their desirability,
guessing when your preferences are not distinct. If you find it really impossible to
guess your preference, you may omit the question.

1. Do you think that a good government should aim chiefly at:
a. More aid for the poor, sick, and old
b. The development of manufacturing and trade
c. Introducing more ethical principles into its policies and diplomacy
d. Establishing a position of prestige and respect among nations

(Allport and Vernon, 1931. By permission of Houghton Mifflin Co.)






DYNAMIC PATTERNS 
ILLUS. 208. PROFILE OF VALUES

Mean Scores of 61 Engineering Students and 80 Missionaries

(Arranged from Allport and Vernon, 1931. By permission of the Journal of
Abnormal and Social Psychology.)


repeat reliability is reported for eighty-four students over one
hundred days as follows:

Religious      .87
Political      .76
Aesthetic      .86
Theoretical    .68
Economic       .79
Social         .50

The authors point out that these figures show that the test of social
values is less discriminating and less consistent than the other tests.
The unsatisfactory results from the social-values test are attributed
to confusion in defining the term social, which led to the inclusion of
several independent traits under this heading, and also to ambiguities
in interpretation of certain items. The authors question the existence
of Spranger's "social type" and conclude (p. 272):
On the theoretical side the evidence from recent applications of the Study
of Values must be interpreted as establishing the Values, with the exception
of social, as self-consistent, pervasive, enduring, and above all, generalized
traits of personality. Several experiments demonstrate a clear relationship
between values and conduct. They show that a person's activity is not
determined exclusively by the stimulus of the moment, nor by a merely
transient interest, nor by a specific attitude peculiar to each situation which he
encounters. The experiments prove on the contrary that general evaluative
attitudes enter into various common activities of everyday life, and in so
doing help to account for the consistencies of personality.

According to Allport and Vernon, proof for the existence of these 
generalized attitudes lies in a number of findings, set forth as follows: 

1. The internal consistencies of the various subtests. Items were selected
which showed the highest relationships between item score and total
subtest score.

2. The diversity of items in a subtest. The authors tried to select items 
which would test as many different applications of a generalized interest as 
possible. 

3. Correlations between the Study of Values scores and activities, such as
reading a newspaper, showing interest in clothes, securing high grades in
college studies, attending church, speaking, handwriting, and artistic
activities. Such correlations range from approximately +.25 to +.66 on small
groups of students.

From an examination of these findings, however, the existence of 
these five general attitudes does not seem to the writer to be well 
established. The internal consistencies, which are generally low, 
would in any case merely show a coincidence of pattern among the 
persons tested. Such coincidences may occur from a number of factors 
rather than one of the five general attitudes selected. The moderate
correlations between scores on the Study of Values Test and various 
activities might also be explained as resulting from a larger or smaller 
number of factors. Moreover, the test items for a single field appear 
to many to be quite independent. Thus, in the field of economics, 
there seems to be little if any connection among interests in financial 
transactions, in vocational training programs, and in problems of 
practical engineering. In the field of politics, which is more accurately 
described as interest in power, there seem to be small similarities 
among such varied activities as debating, playing on athletic teams, 
curtailing charity, aggression in war, and exploring strange lands. 

Furthermore, an inspection of test items reveals an overlapping of 
fields of interest, a fact which makes their identification difficult. For 
instance, the writer has found that many persons cannot distinguish 
between a search for truth in the theoretical field, an attempt to evalu- 
ate life in the religious field, and appreciation of true art in the 
aesthetic field. There is also a marked similarity between
unselfishness in the social field and ethics in the religious.

Indirect Approaches

All of the studies reported above, except that of Watson, attempt
to appraise attitudes by direct approaches, such as by asking a person
to what extent he agrees with a statement. A direct approach allows
a person to falsify purposely his answers. There seem to be
many situations where a person prefers to conceal his real feelings,
because he believes their disclosure would get him into trouble or
cause him to lose a friend. The studies reported below are noteworthy
in that the methods used attempt to disguise the purpose of the test.

Hammond (1948) approached attitude measurement by using an
information test in which one was asked to choose between two
answers. In one group of items (Hammond, 1948, p. 39) both answers
were placed about equally distant from the right answer, as:

The average weekly wage of war workers in 1945 was
(1) $37 (2) $57

In another group of items the two choices indicated extremes but
the truth was indeterminate, as:

Russia’s removal of heavy industry from Austria was
(1) legal (2) illegal

Hammond called both of these types of items nonfactual, because
the true answer did not appear. Two tests were prepared, one
concerning labor-management attitudes, and the other attitudes toward
Russia. Hammond applied these tests to a businessmen’s luncheon
club and to a group of clerical and semiprofessional employees of
a large labor union. The critical ratio of the difference between the
means was about 12 for each test. An item analysis revealed that
only five of the forty items used on the two tests failed to show a
significant difference between the two groups. The split-half
reliability coefficients were approximately .80.
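
The two statistics quoted for Hammond's tests may be illustrated by a short computation. The Python sketch below is not taken from Hammond's report. The scores are invented, the function names are hypothetical, and the Spearman-Brown step is an assumption, since the account above does not say whether the reported coefficients were stepped up.

```python
import math
import statistics


def critical_ratio(group_a, group_b):
    """Difference between two group means divided by the standard error of that difference."""
    se_a = statistics.stdev(group_a) / math.sqrt(len(group_a))
    se_b = statistics.stdev(group_b) / math.sqrt(len(group_b))
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    return diff / math.sqrt(se_a ** 2 + se_b ** 2)


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def split_half_reliability(item_scores):
    """Correlate odd-item and even-item totals, then apply Spearman-Brown (an assumption)."""
    odd_totals = [sum(person[0::2]) for person in item_scores]
    even_totals = [sum(person[1::2]) for person in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    return 2 * r_half / (1 + r_half)


# Invented scores: number of answers chosen in the pro-management direction.
businessmen = [15, 17, 14, 18, 16, 15]
union_staff = [8, 10, 7, 9, 11, 9]
print(round(critical_ratio(businessmen, union_staff), 2))

# Invented item responses (1 = keyed answer chosen) for four persons on four items.
items = [[1, 1, 1, 1], [1, 0, 1, 0], [0, 0, 0, 0], [1, 1, 0, 0]]
print(round(split_half_reliability(items), 3))
```

A critical ratio as large as 12 means that the two group means differ by twelve standard errors, far more than chance sampling would be expected to produce.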

Hammond also experimented to see if the same tests would work
equally well as attitude questionnaires, with instructions which
explained that the nonfactual items had no correct answers. For a large
group of college students the test concerning labor-management facts
gave almost the same results when used as an attitude questionnaire.
In the case of the test concerning Russia, however, there was
considerable variation between the information test and the attitude
questionnaire results. Moreover the attitude-questionnaire results
yielded much lower split-half reliabilities than the information-test
results. Hammond believed that the information type of test gave a
better indication of attitudes because the purpose of the test was
somewhat disguised.

Another indirect approach to the appraisal of attitudes toward
another person is that reported by Stogdill (1949), who used data
from four naval organizations. In his procedure each officer was asked
to name those with whom he spent the most time: his assistants, his
associates and superiors, or those in other departments. Lastly, each
officer was asked to consider lists of names of officers and to rank them
in order according to the time spent. In making these estimates each
officer was asked to make the results typical of a usual working month.

The data were plotted and analyzed to show working relations
among the members of each organization. Considerable variation was
found among organizations and among persons in similar positions.
Deviations from formal lines of communication were most marked
in organizations where the commanding officer had expressed an
active interest in “cutting red tape.” Deviations among individuals
seemed to indicate personal likes or dislikes, but the total amounts
of time spent had little significance as indicators of preferences.

Factorial Analyses of Attitudes 

The application of multiple-factor analyses to data of this sort
may be of value in discovering underlying principles. Such an analysis
was made by Lurie (1937), using a 144-item test similar to the Study
of Values Test in content and construction. The test sheets were
given to six hundred students, but only 203 (128 men and 75 women)
returned them complete enough to be used. A 7-step rating scale,
ranging from complete rejection to complete acceptance, was used
for each item. Twenty-four scores were secured for each person by
subdividing each of the six main divisions into four smaller ones on
the basis of (a) present interests, (b) ideals or standards, (c) preferences
for associates or famous persons, and (d) beliefs or opinions. An
analysis of these 24 scores by Thurstone’s method resulted in the seven
independent factors tentatively named:

1. Social or altruistic: friendly, tolerant

2. Philistine: aggressive, utilitarian, anti-cultural

3. Theoretical: interest in science, criticalness

4. Religious: ethical doctrine and practice (not mystical)

5. Open-minded: liberal vs. conservative

6. Practical: pressure to do something

7. Aesthetic: superficial lip service to culture, negatively related to
aesthetic ideals

Lurie states that the first four of these factors may be taken as
stable categories which cover all six of Spranger's types. The philistine
factor is found to be large in both the economic and the political
fields and also large, but in a negative way, in the aesthetic ideals.
Reports by other investigators support these findings about the
intercorrelations among scores on the Study of Values Tests. Lurie
believes that the last three factors are temperament traits or adjustments
which cut across fields of interests, and which were not controlled in
setting up the test. Lurie’s findings also include some low correlations
among scores which were supposed to measure similar interests.

The fact that Lurie found evidence for a basic unitary social or
altruistic interest, whereas Allport and Vernon did not find it, needs
an explanation. The evidence is not at hand for a complete explanation,
but the different findings must have resulted from a different
sampling, either of interests or of persons. It may be that in Lurie's
form a larger number of items clearly differentiate between altruistic
and nonaltruistic activities than in the Study of Values Test. Or
it may be that Lurie's subjects were actually more variable in altruistic
tendencies than those of Allport and Vernon, and hence showed
distinct individual differences in this field. A combination of both
explanations may be nearer the truth. The results nicely illustrate
the point that the statistical factors are always related to the particular
items used and the particular population tested.

Whisler (1934) designed a set of questions which were intended to
evaluate “generalized attitudes.” He avoided the use of specific-situation
items such as are found in nearly all the other inventories
discussed, believing that a few questions, such as those in Illus. 209,
would yield as adequate a sample of attitude as a larger number of
specific items. A factor analysis of thirty-one items was made, using
126 undergraduates. Six factors were reported and identified from
the items which had the largest loadings: (a) acceptance of conventional
ethical principles, (b) enjoyment of momentary pleasure, (c)
interest in conflicts and controversies, (d) desire to be an effective
agent, (e) participation in casual social relations, and (f) criticalness
and interest in the truth.

The identification of these factors is not conclusive, owing to the
different interpretations which may be placed on an item. Thus, Item
16, “having a standard of goodness by which plays are judged,” may
be interpreted as (a) having a moral standard of goodness, or (b) the
possession of any standard, or (c) referring to particular types of plays,
or (d) referring to the rendition of a play rather than to its construction.
Various interpretations of other items are probable. Less ambiguous
statements would allow more accurate interpretations of the
results. Yet Whisler's approach is an interesting one which has given
results different from those of other workers.

ILLUS. 209. QUESTIONS ON PERSONAL ATTITUDES

Instructions: Take the words used as roughly indicating a three-fold scale into
which responses to the question may be put. The scale refers to your judgment
of yourself, not to what you think other people’s judgment may be. Consider
each question carefully, but if in doubt as to the proper checking, guess.

Abbreviations used: M, much or frequently; S, considerable or sometimes; L,
little or rarely.

1. How much would your liking for an acquaintance of the same sex be affected
by whether that person had a genuine liking for and interest in children, or was
indifferent?

M S L

2. How much enjoyment do you get out of doing or saying things which are quite
shocking to most of the people who see or hear you?

M S L

3. How much change has there been in the last two years in the type of people
you prefer and seek to associate with?

M S L

4. How much enjoyment do you get out of working with materials or making
things with your hands?

M S L

5. To what extent, in general, is your evaluation, judgment, and opinion of an
acquaintance of the opposite sex based on speculation and thought as to how
satisfactory the person might be as husband or wife?

M S L

6. To what extent are you interested in politics? (That is, local, provincial,
national, or international politics.)

M S L

(25 more items)

(Whisler, 1934, p. 285. By permission of the Editor, Journal of Educational
Psychology.)

Carlson (1934) reported the results of applying five of Thurstone’s 
attitude scales to 215 seniors at the University of Chicago. The cor- 
relations between measures of intelligence by a group test and atti- 
tude scores were: 

Prohibition     .036     Communism       .330
God            -.191     Birth Control   .211
Pacifism        .402

These results lead to the conclusion that the more intelligent
students are likely to be more liberal in attitudes toward these issues.
Carlson also found that a multiple-factor analysis of these five
attitude scales and intelligence test scores yielded three factors. These
were tentatively named intelligence, radicalism, and religiousness by
an inspection of the various loadings.

The question of the existence of basic general attitudes, such as
those named in the three studies just examined, will lead to useless
controversies unless careful definition and unbiased procedures are
used. Apparently, if one is allowed to select some observations and
ignore others, overwhelming evidence can be secured supporting almost
any hypothesis about the nature of attitudes. Furthermore, if test
items are carefully chosen to represent a certain general attitude and
if a random selection of persons is tested, that general attitude will
usually seem to be supported by an analysis of the test results. In order
to secure a complete picture of basic opinions and their relationships,
items must be included which clearly represent all possible attitudes.
The results of applying such a test to large normal populations will
make possible a more adequate analysis of attitudes. Those factors
described thus far may be found to be unique, or they may be
combined with other patterns or subdivided into more essential traits.

PRACTICAL RESULTS 

The strength of various attitudes among students, employees, or
voters is a major concern of educators, employers, advertisers, and
statesmen. A large share of their time and effort is spent in trying to
develop particular attitudes in their communities. Measuring
instruments which may give accurate pictures of changes in attitudes
are therefore in great demand and the field is rapidly developing.
Some studies of nation-wide scope have been of little value because
they neglected to secure measures of a representative group of people.
Carefully controlled studies which measure the same persons twice
are usually limited to a few highly selected students; hence their
results are also of limited significance. The sampling of attitudes of
large populations requires a great deal of careful planning and hard
work. Sampling techniques are discussed in Chapter III.

Modification of Attitudes 

The effects of propaganda have been studied in a fragmentary
fashion by a number of investigators. Generally, propaganda
techniques include dissemination of reading matter, oral arguments,
pictures, and films of actual scenes. Although many efforts at
propaganda combine several of these techniques, several studies have
tried to evaluate them separately. A few studies will be reported to
illustrate the chief results and some of the difficulties which have
been encountered in attempting to measure changes in attitudes.

Modifications Due to Actions or Arguments. Modifications of
attitudes in line with the attitudes of instructors have been reported
by Manske (1936) and Kroll (1934). Manske, using Hinkley's scale
of attitudes toward the Negro, reported that among thirty-two classes
of high school students, eight showed slight changes opposed to
teachers’ attitudes during ten “non-indoctrinating” lessons taught by
one of sixteen teachers. This study seems to indicate that teachers
are able to be impartial in presenting material of this sort. Kroll
measured the result of one semester's instruction in English history.
He found that when the teacher held a radical point of view, scores
on Harper’s scales of social attitudes changed toward radicalism
among 183 high school boys, but that when the teacher took a
conservative position, the boys showed no reliable changes toward
conservatism.

A number of studies report changes in attitudes due to purposeful 
instruction. Results from grade school and college groups have been 
reported on attitudes toward racial groups, propaganda, treatment 
of criminals, patriotism, prohibition, war, and other topics. Murphy
et al. (1937) wrote a thorough discussion of such studies, which may
be summarized as follows:

1. High school groups are usually more susceptible to changes in attitude
than college or adult groups but the changes may be less permanent. 

2. Changes, generally in the direction of liberalism, usually accompany 
both high school and college instruction in social, economic, and political 
studies. 

3. The attitudes of the instructor may be as important as the information 
which is discussed. 

4. A violent argument or episode tends to make people take sides, for a 
neutral position becomes untenable. The negative response to the appeal 
may be as great or greater than the positive. The distribution of scores tends 
to become flattened or bimodal. 

Biddle (1932) reported that among 350 high school and college 
students susceptibility to propaganda was greatly reduced by reading 
pamphlets on the techniques of propaganda. Among ten thousand 
college students Knower (1935) found that there was little difference 
between the effects of “emotional” and “rational” appeals concern- 
ing prohibition, and that significant changes were larger among all 
groups when students read the appeal than when they heard it delivered
orally. Other reports by Cherrington and Miller (1933) and
Wilke (1934) indicate that speeches given in person or over the radio
are at least as effective as printed presentations of the same material.

Changes in attitude which follow the viewing of a motion picture
have been reported by Thurstone (1931) and his collaborators. For
instance, a film which showed Chinese people as able and artistic
caused a mean change toward more favorable scores on the scale of
attitude toward Chinese equal to ten times the standard error of the
change. Another film in which some Chinese were pictured in a
more unfavorable light resulted in a mean change to a less favorable
attitude equal to 2.2 PE. Another film which showed bootlegging as
a vice resulted in no significant change in attitude toward bootlegging,
which was already unfavorable. Other studies of attitudes
toward war, crime, and racial groups have, in general, shown that
changes in scores occurred after the subjects viewed films, and that
these changes were retained for periods of as long as 19 months, with a
gradual return toward the original position.
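
The expression of a mean change in units of its standard error, or of its probable error (PE), may be made concrete by a small computation. The Python sketch below is only an illustration under stated assumptions: the scale scores are invented, the same persons are assumed to have been measured before and after the film, and the function name is hypothetical. The constant 0.6745 is the usual ratio of the probable error to the standard error.

```python
import math
import statistics

PE_PER_SE = 0.6745  # one probable error is about 0.6745 standard errors


def change_in_error_units(before, after):
    """Return the mean change, and that change divided by its SE and by its PE."""
    diffs = [a - b for b, a in zip(before, after)]
    mean_change = statistics.mean(diffs)
    se_change = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_change, mean_change / se_change, mean_change / (PE_PER_SE * se_change)


# Invented scale scores for six persons before and after viewing a film.
before = [5.1, 4.8, 6.0, 5.5, 4.9, 5.2]
after = [6.0, 5.9, 6.4, 6.6, 5.5, 6.1]
mean_change, in_se, in_pe = change_in_error_units(before, after)
print(round(mean_change, 2), round(in_se, 1), round(in_pe, 1))
```

Because a probable error is smaller than a standard error, a change of 2.2 PE corresponds to only about 1.5 standard errors, a much weaker result than a change of ten standard errors.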

Murphy and Likert (1938) reported a study of the effects of written
propaganda. Fourteen test items were selected which appraised
attitudes toward war, imperialism, and Negro activities. These were
administered to two small groups of college students. A week later
one group read propaganda material which argued for radical
changes and the other group read material supporting conservative
points of view. The material consisted of seven short excerpts from
speeches by army officers, college professors, and persons well known
in public life. These selections could be read in about 30 minutes.

A week after the propaganda material had been read, the test was 
repeated. Then each group was given material to read which sup- 
ported the point of view opposed to that which had been previously 
presented. Four weeks later the attitude tests were repeated. These 
three tests showed how much each student was affected by this 
amount of conservative and radical propaganda. Although there was 
some shifting of opinion among students, there was no marked 
tendency for those with radical inclinations to vary more or less than 
other students. In one of the two groups there was a slight tendency 
for the more radical students to show greater susceptibility to propaganda.
The lack of positive results may be due to several factors:
the combining in one score of attitudes toward several different
issues, the use of several short excerpts from speeches rather than
one long concentrated speech on one topic, the delay of from one to
four weeks between reading the propaganda material and repeating
the tests, and the sophistication of the students.

Cantril (1944) noted that public-opinion surveys showed that
opinion was highly sensitive to important events, and that events
were much more important than words or propaganda. Opinion in
this country seldom anticipated emergencies and reacted to emergencies
only when self-interest was involved. Thus the German invasion
of Czechoslovakia and Poland aroused few Americans, but the German
invasion of the Low Countries and Norway made people realize
that a German victory would affect them personally. Cantril believed
that if people in America were given ready access to information,
public opinion would reveal a hard-headed common sense and would
tend to agree with the opinions of experts.

Remmers (1950) reported a practical way of measuring empathy
or its opposite, the projection of one's own attitudes. He applied the
method specifically to measuring the “gap” between management's
and labor's attitudes. The questionnaire “How Supervise” was filled
out by a group of industrial managers and also by a group of labor
leaders. The labor leaders were later asked to fill out the same
questionnaire the way they thought businessmen would answer it. The
difference between the mean scores of labor leaders on the two occasions
showed their estimate of the gap between themselves and businessmen.
The difference between the mean scores of labor leaders answering as
businessmen and the mean scores of the businessmen themselves showed
degree of empathy, that is, the ability of the labor leader
to put himself in the other person's place. The results furnished
important information on points of agreement and disagreement, both
real and imaginary, and marked individual differences were found.
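
The comparisons in Remmers' procedure reduce to differences among three means. The Python sketch below illustrates the arithmetic only. The scores, the function name, and the dictionary keys are hypothetical, and the "actual gap" is an added quantity, included on the assumption that real disagreement is the difference between the two groups' own answers.

```python
import statistics


def empathy_summary(labor_own, labor_as_business, business_own):
    """Compare labor's own scores, labor's predictions for business, and business's own scores."""
    labor_mean = statistics.mean(labor_own)
    predicted_mean = statistics.mean(labor_as_business)
    business_mean = statistics.mean(business_own)
    return {
        # Gap that labor leaders believe separates them from businessmen.
        "estimated_gap": predicted_mean - labor_mean,
        # Gap between the two groups' own answers (assumed meaning of "real").
        "actual_gap": business_mean - labor_mean,
        # Error of prediction; a value near zero indicates high empathy.
        "empathy_error": predicted_mean - business_mean,
    }


# Invented questionnaire scores.
labor_own = [42, 47, 45, 50]
labor_as_business = [70, 78, 74, 76]
business_own = [63, 66, 61, 68]
print(empathy_summary(labor_own, labor_as_business, business_own))
```

When the estimated gap exceeds the actual gap, as in these invented figures, part of the disagreement is imaginary in the sense used above.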

Modifications Due to Prestige. Marple (1933) reported that high
school seniors, college seniors, and adults, in that order, were
susceptible to the influence of majority opinion, although the differences
between groups were not great. The opinions of these persons were
secured on two occasions, the second a month later than the first.
They were asked to mark seventy-five controversial statements about
government, war, races, schools, and morals to show agreement,
uncertainty, or disagreement. On the second occasion the group was
supplied with a record of the answers which had actually been given
on the first occasion. Changes between two trials in the direction of
group opinion were: for high school students 64 per cent, for college
students 55 per cent, and for adults 40 per cent. A similar experiment
showed that the influence of opinions given by a group of 40 experts
was not quite so great as the influence of the opinions of one's own
group, although both were marked.

A study by Moore (1921) reported that conservative students
accepted majority opinion more regularly than radical students, but
Murphy and Likert (1938) reported no relationship between radical
attitudes and changes in attitudes due to knowledge of majority
opinion. They found that announcing the majority opinion orally
after reading a statement on a second test did cause students to vary
somewhat from the first test. The correlations between scores on the
internationalism scale and shifts toward majority opinion were,
however, nearly all zero.

Modification Concurring with Age. The work of Horowitz (1936)
described above gives an excellent illustration of age-group differences
in attitudes among grade school pupils. Two other illustrations
are given below which are typical of the many reports available.

Murphy and Likert (1938) reported that approximately 5 years
after the first survey in 1929, a retest using three attitude scales was
made of 129 individuals who had been graduated from college. The
odd-even reliability of attitude scores was nearly the same as that
reported for the first trial. The mean scores shifted slightly toward
more liberal and radical attitudes. The most significant differences
were found on the scales for imperialism and economic practices.
The reasons for these shifts were discussed at length without any
clear-cut conclusions. Data on personal incomes indicated that the shifts
were probably not due to personal want. The authors believe that the
change during 5 years probably reflected awareness of the seriousness
of the causes of the widespread world depression.

The reasons for attending church were recorded by Kingsbury
(1937) for a group of Protestant churchgoers in Chicago. The
percentages of persons who checked eight reasons were found for the four
age groups: fifteen to twenty-five, twenty-six to thirty-five, thirty-six
to fifty, and over fifty. Nearly 80 per cent of the youngest group
checked “to formulate a philosophy of life,” “to hear music and
literature,” and “to gain new friends,” but only about 20 per cent of
the oldest group checked these items. Thirty per cent or less of the
youngest group reported that they attend church “to keep alive the
spirit of Christ,” “to encourage family attendance,” or “from habit,”
whereas nearly 80 per cent of the oldest group checked these reasons.
“To solve personal problems” was checked by approximately 50
per cent of all ages. “Just some place to go” dropped from 30 per cent
to 7 per cent with thirty years of growth. “For reassurance of
immortality” dropped from 30 per cent to nearly zero between the
twenty- and thirty-year-old groups. This reason became increasingly
important with advance in age, 30 per cent of the fifty-year-olds having
checked it.

The Relation of Information to Attitudes 

The question whether accurate knowledge about a situation
accompanies a liberal attitude is one which can be answered by
comparing information test scores with attitude scale scores.

A remarkably widespread study by Watson (1929) reported the 
application of a questionnaire on Far Eastern relations among three 
thousand adults in church groups, prisons, business clubs, and 
schools. The questionnaire contained both information items and
attitude items. The scores of information on Japanese issues correlated
.82 with favorable attitudes toward the Japanese, and information
on Chinese issues correlated .70 with favor toward Chinese
nationalism. Another study by Manry (1927) found among college 
students a correlation of .69 between knowledge of international af- 
fairs and a favorable attitude toward world citizenship. Wrightstone 
(1934) found that historical knowledge correlated .58 with economic 
liberalism among four hundred pupils in the ninth grade to the 
twelfth grade. 

Reckless and Bringen (1933) found that among college students 
information about Negroes and their problems had a mean correlation
of .64 with favorable attitudes toward the Negroes.

In contrast to these results are those of Murphy and Likert (1938),
who found a zero correlation between rough measures of information
and liberalism in a study of internationalism and attitudes toward
the Negroes. Biddle (1931) reported correlations between zero
and .26 between knowledge and unfavorable attitudes toward Filipinos,
Japanese, and Chinese. Bolton (1935) found zero correlations
between information and attitude toward Negroes for seven hundred
college students. The low relationships reported are probably due
in part to sampling of narrow groups, to mixing of issues, and to
the fragmentary nature of the tests.

The general conclusion is that well-informed persons usually take
a fairly liberal or experimental view on controversial issues. Poorly
informed persons are more likely to approve extreme radical or
reactionary policies.

Cantril (1944) reported several public-opinion polls which showed
that one effect of greater information is to make the well-informed
more sensitive to the implications of points of view. Persons well
informed in one area, for example, European affairs, tended to be well
informed also in another area, such as Far Eastern affairs. Where
personal wishes or identification was strong, however, greater information
about a topic did not carry much weight in opinion polls.

In a study of students in Bennington College and the Catholic
University of America, Newcomb (1946) found that the “attitude climate”
of a group was related to the information of the group. Newcomb
secured an indication of attitude climate by a questionnaire on
attitudes toward the Spanish Civil War parties in 1937. Instructions
were given to make one of the following answers to each item: strongly
agree, agree, uncertain, disagree, or strongly disagree. Typical
statements (Newcomb, 1946, p. 301) were:

I hope the Loyalists win the war.

The real issue in this war is nationalism versus communism.

The conflict in Spain may be fairly accurately described as a
German-Italian attack on the Spanish Government.

The answers were weighted so that the higher scores showed greater
Pro-Nationalist sympathy and the lower scores greater Pro-Loyalist
sympathy. From this questionnaire a critical ratio of 16.5 was found
between the mean scores of the two colleges.
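
The weighting of answers described above may be pictured as follows. This Python sketch is an assumed scheme and not Newcomb's published key. The weights of 1 to 5, the reversal of Pro-Loyalist statements, and all names are hypothetical choices made only so that higher totals indicate Pro-Nationalist sympathy.

```python
# Assumed weights for the five permitted answers.
WEIGHTS = {
    "strongly agree": 5,
    "agree": 4,
    "uncertain": 3,
    "disagree": 2,
    "strongly disagree": 1,
}


def attitude_score(responses, keyed_pro_nationalist):
    """Sum the item weights, reversing items whose agreement shows Pro-Loyalist sympathy."""
    total = 0
    for answer, pro_nationalist in zip(responses, keyed_pro_nationalist):
        weight = WEIGHTS[answer]
        # Reversal maps 5 to 1, 4 to 2, and so on, leaving 3 unchanged.
        total += weight if pro_nationalist else 6 - weight
    return total


# Three statements; the first is assumed to be keyed Pro-Loyalist, the others Pro-Nationalist.
key = [False, True, True]
print(attitude_score(["strongly agree", "disagree", "uncertain"], key))
```

Under such a scheme a student who strongly agrees with "I hope the Loyalists win the war" receives the lowest weight on that item, which keeps all items scored in one direction.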

Newcomb then measured information by three 20-item true-false
tests. One test (p. 301) contained items thought to be neutral, for
example:

The present seat of the Loyalist Government is in Madrid.

The government in power when the civil war broke out represented a
coalition of left and liberal parties.

Another test (p. 301) consisted of Pro-Loyalist items such as:

The Loyalist planes have in no instance been guilty of shelling
noncombatants.

General Franco’s government has been recognized as the legitimate power
in Spain only by governments which are overtly fascist or near fascist.

The third test (p. 301) consisted of items which were Pro-Nationalist,
as:

Indisputable evidence has been adduced showing that some clergy have
been executed and many persecuted by Loyalist sympathizers.

The scores on these tests were correlated with the scores on the
attitude scales with the following rather striking results:

                                      Catholic
Type of Information    Bennington    University
Neutral                   -.45           .38
Pro-Loyalist              -.57          -.08
Pro-Nationalist           -.04           .51

These results show that the persons at Bennington with most
Pro-Loyalist sympathy had the most neutral information, while the
persons at Catholic University with the most Pro-Nationalist sympathy
had greater neutral information. (The coefficients for Bennington are
negative because high attitude scores denote Pro-Nationalist sympathy.)
Also the Pro-Loyalist information was correlated about .50 with
Pro-Loyalist feeling at Bennington, and Pro-Nationalist information
about .50 with Pro-Nationalist feeling at Catholic University.
Newcomb (p. 292) believed that the following hypotheses were
supported by these figures:

1. That individual information relevant to a social issue is determined
by degree of concern, opportunity for becoming familiar with the evidence,
and usefulness of information in supporting existing attitudes.



2. That the manner in which these factors serve to determine information 
is a function of attitude climate defined in terms of uniformity, direction, 
and intensity of the attitude in question, in a given community. 

The figures seem to indicate that degree of concern increased the
amounts of neutral and favorable information learned. Opportunity
or accessibility of information could not be well appraised in this
experiment. The usefulness of information in supporting an attitude
is shown by the significant correlations between attitudes and favorable
information and the near zero correlations between attitudes
and unfavorable information. Newcomb points out that generalizations
about attitude-information relationships can never be safely
made without a careful analysis of the attitude climate.
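
The pattern of signs in Newcomb's table may be easier to follow with a small worked case. The Python sketch below uses invented scores for one hypothetical group and is not a reanalysis of his data. It assumes, as in the text, that high attitude scores denote Pro-Nationalist sympathy, so that information held mostly by Pro-Loyalist students yields a negative coefficient.

```python
import math
import statistics


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlations_by_test(attitude, information_tests):
    """Correlate one attitude score list with each named information test."""
    return {name: pearson_r(attitude, scores) for name, scores in information_tests.items()}


# Invented scores for six students in a predominantly Pro-Loyalist group.
attitude = [12, 15, 18, 22, 25, 30]  # high = Pro-Nationalist
information = {
    "neutral": [16, 15, 13, 12, 10, 9],
    "pro_loyalist": [18, 17, 14, 13, 9, 8],
    "pro_nationalist": [6, 9, 7, 8, 6, 9],
}
for name, r in correlations_by_test(attitude, information).items():
    print(name, round(r, 2))
```

In such a group the students most favorable to the Loyalists know the most neutral and Pro-Loyalist facts, while knowledge of Pro-Nationalist facts bears little relation to attitude.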

Comparison of Various Techniques 

Biographies, Inventories, and Rating Scales. The question, which
techniques are most valuable in the study of attitudes, can only be
answered in terms of their accuracy and their uses. The three most
common techniques employ biographies, inventories, and ratings of
general attitudes. Nearly all writers who have used case histories or
autobiographies report that these yield the most complete accounts.
A good deal of emphasis is therefore being placed upon securing
accurate and complete histories. Comparisons of thorough case histories
with measurement techniques have been made by several persons.
Stouffer (1930) compared a scale of attitudes toward prohibition with
a written account of activities, and with a graphic rating of attitude
toward prohibition. Two hundred and thirty-eight college students
first completed the scale, and then wrote accounts, using approximately
one thousand words, of their experiences with the prohibition
law and drinking liquor. Four judges rated these accounts on a
graphic scale of indulgence. The agreement of judges found by
correlating the rating of each judge with the ratings of the others ranged
from .83 to .89, mean .87. These figures show a high degree of
consistency. The correlation between scores on the attitude scale and the
composite rating by four judges was .81.
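
The agreement among judges and the use of a composite rating may be sketched in a few lines. The Python example below is an illustration under assumptions, not Stouffer's computation. The ratings are invented, agreement is taken to mean the correlation for every pair of judges, and the composite is taken to be the simple sum of the judges' ratings.

```python
import math
import statistics
from itertools import combinations


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def judge_agreement(ratings_by_judge):
    """Lowest, highest, and mean correlation over all pairs of judges."""
    rs = [pearson_r(a, b) for a, b in combinations(ratings_by_judge, 2)]
    return min(rs), max(rs), statistics.mean(rs)


def composite_rating(ratings_by_judge):
    """Sum the judges' ratings for each account (assumed form of the composite)."""
    return [sum(per_account) for per_account in zip(*ratings_by_judge)]


# Invented ratings of six accounts by four judges, and invented scale scores.
judges = [
    [2, 4, 5, 7, 8, 9],
    [3, 4, 6, 6, 8, 9],
    [2, 5, 5, 7, 7, 10],
    [1, 4, 6, 8, 8, 9],
]
scale_scores = [3.1, 4.0, 5.2, 6.8, 7.1, 9.0]
low, high, mean_r = judge_agreement(judges)
print(round(low, 2), round(high, 2), round(mean_r, 2))
print(round(pearson_r(scale_scores, composite_rating(judges)), 2))
```

A composite of several judges is ordinarily more dependable than any single rating, which is one reason the scale was compared with the composite and not with one judge alone.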

The same students also rated themselves on two occasions with a
graphic rating scale for attitudes toward prohibition. When the two
self-ratings were combined, it was found that the combined
self-ratings correlated .80 with Smith's scale and .80 with judges' ratings of
accounts. These correlations are all high enough to suggest that the
three methods were appraising the same aspects of persons with about
the same accuracy. It is probable that each of the three methods
introduces unique aspects which may be of importance in particular
evaluations.

Another study by Watson (1925) compared scores on his test of
public opinion with short descriptions written by friends. The
following account is a description of the person whose profile is shown
in Illus. 201. Both the profile and the account show extremely marked
preference for fundamental Protestant beliefs and strict morals, and
a less marked preference for mild economic reforms.

This man is a farmer in a small Wyoming town. He is the leading pillar
in the Methodist church. When I visited one Sunday he was drilling the
children in the Sunday School on the names of the persons and the
significant details of each of the Old Testament miracles. His
neighbors consider him, next to ____, the most stubborn man they have
known. He is a rank-and-file member of the Farm Bureau, so should
share some of their progressive economic ideas. (From Watson, 1925,
p. 46.)

A much more elaborate study by Murphy and Likert (1938) also
showed close correspondence between scores on their test and
autobiographies written by students. Several cases were found,
however, where apparent inconsistencies were cleared up only by a
careful rereading of the autobiographies.

Thurstone and Chave (1929) compared scores on the
Attitude-toward-the-Church Scale with students' self-ratings made at
the same time. A graphic rating scale was used which consisted of a
horizontal line and the words "strongly favorable to the church"
printed at one end, "neutral" in the middle, and at the other end,
"strongly against the church." The correlation between scores and
ratings was .67, a fairly high figure. The same students also
indicated whether or not they attended church frequently and were
active members in a church. The frequent churchgoers were ten times as
numerous among those whose scores were favorable to the church as
among those whose scores were unfavorable.

Median Scale Value vs. Total Score. Likert (1932), using two
scoring techniques, applied several of Thurstone's scales. First, he
used Thurstone's method of having a person simply check the
statements with which he agreed, a procedure which yields a median
scale-value score. This score implies that a person agrees with all the
statements which are less extreme than his own median statement.
In practice this may not be the case. Likert also used a total score
method which required a person to rate the same items on a 5-step
scale, using the words (5) strongly approve, (4) approve,
(3) undecided, (2) disapprove, (1) strongly disapprove. A person's
score was secured by adding the points checked on all the items. A
comparison of the results of these two techniques showed that the
retest reliability for the median scale values was .76, whereas that
for total scores was .85. The scale values correlated with total scores
.88. Thus it appears that while both methods of scoring yield similar
results, the total-score method shows considerably more
self-consistency. The reasons advanced for this superiority of total
scores are: (1) each item bears a share of the score directly, and
(2) persons can probably express themselves more accurately with a
5-step scale than with only two choices, as when given the choice of
either agreeing or disagreeing. The total scoring technique can be
applied without the elaborate scale construction which is preliminary
to the median scale-value scores. Thurstone's scale construction has
the advantage, however, of guaranteeing the inclusion of items which
represent a wide range of attitudes and an equal number of items at
each level.
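
The two scoring methods compared above can be illustrated with a short
sketch. It is only an illustration under stated assumptions: the scale
values, the statements checked, and the ratings below are hypothetical
and are not taken from Thurstone's or Likert's materials, and the
function names are invented for this example.

```python
from statistics import median


def median_scale_value(scale_values, endorsed):
    """Thurstone-style score: median scale value of the statements checked."""
    return median(scale_values[i] for i in endorsed)


def likert_total(ratings):
    """Likert-style score: sum of the 1-to-5 ratings given to all items."""
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("each rating must be an integer from 1 to 5")
    return sum(ratings)


# Hypothetical scale values for five statements (illustrative only).
scale_values = [1.5, 3.0, 5.5, 8.0, 10.5]
endorsed = [1, 2, 3]          # indexes of the statements the person checked
ratings = [2, 4, 5, 4, 1]     # the same person's 5-step rating of every item

print(median_scale_value(scale_values, endorsed))
print(likert_total(ratings))
```

Note that the median score uses only the statements checked, while the
total score draws on every item, which is one of the reasons advanced
above for its greater self-consistency.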

These comparisons of various techniques lead to five tentative con- 
clusions. 

1. A scored biography and a questionnaire yield, under fairly
well-controlled conditions, about equal retest consistencies, ranging
from .60 to .80, depending upon the range of scores in the group, and
the ambiguity of the items. Scores which depend upon the summation of
a large number of items show greater retest or odd-even reliability
than scores which depend upon a few judgments, or on a median
value.

2. Self-ratings and self-inventories of activities seem to have about
the same validity, as shown by agreement with various criteria. The
selection of satisfactory criteria is particularly difficult.

3. The graphic rating of single general attitudes, as toward pro- 
hibition, is likely to yield retest reliabilities in the neighborhood of 
.70 among college groups, using the optimum number of steps. 

4. Convenience and economy in appraisal lie with the verbal in- 
ventories and graphic scales. 

5. All of the direct appraisals of attitudes are subject to intentional 
misrepresentations which are hard to detect. No indirect or observa- 
tional methods have been widely used. 

Group Norms 

A fairly large number of studies have been reported of differences 
in attitudes found in age and occupation, sex, race, locality, and 
educational and economic groups. Such differences have often been 
cited as indications of the validity of the scales. Most of these reports 
require a great deal of study because the interpretation of results
involves the simultaneous evaluation of all factors which might
contribute to attitudes. Thus a small difference between attitudes of
men and women toward war may be due to differences in age, training,
and occupations, rather than to the difference in sex. No studies
have come to hand where all these factors have been held constant,
although some studies have produced significant results. In view of
the complexity of the material the reader is referred to the able
discussions by Murphy et al. (1937), Rundquist and Sletto (1936),
Allport (1937), and Cantril (1944).

NEEDED RESEARCH 

Research in this field is considerably behind research in the field of 
interests, discussed in Chapter XX. Additional research is much 
needed outside of the classroom. Comprehensive studies of attitudes 
evaluating age, race, experience, and other factors will be of great 
significance. Reduction of misrepresentation is a pressing problem, 
the solution of which may come from indirect self-ratings, or observa- 
tions made by others. The construction of less ambiguous test items is 
feasible. With such items thorough factorial analyses may indicate 
more clearly the basic patterns of attitudes of a group of persons.

STUDY GUIDE QUESTIONS 

1. How are attitudes defined? How are they formed in an individual? 

2. Describe a method used to develop an attitude scale in which items are 
scaled in equally often-noticed units. 

3. What are the six fields of activity described by Spranger? 

4. What results have thus far been secured in measures of knowledge 
about an issue and the attitudes toward the issue? 

5. What are the relations between attitude categories and interest cate- 
gories (Chapter XX)? 

6. What methods seem to yield the best results in studying attitudes? 



CHAPTER XXII 


PERSONALITY INVENTORIES




This chapter describes general and analytical questionnaires which 
are designed to appraise strength and types of impulses and typical 
adjustive activity. Certain uses of such questionnaires are indicated. 
Lastly, the content and scoring are discussed together with indica- 
tions of need for research. 

CHARACTERISTICS OF INVENTORIES 

One of the most natural developments of the interviewing tech- 
nique is to take questions which a good interviewer would ask and 
present them in written form to a subject. Many lists of such questions
have been prepared. The earlier lists included a large variety of
miscellaneous items designed to yield a general index of maladjustment.
Later, by the use of factorial and logical analyses, inventories were 
developed which yielded analytical scores corresponding to profiles 
of clinical syndromes or to psychological traits or sometimes to both. 

General Questionnaires 

One of the earliest of these is the Personal Data Sheet, prepared by 
Woodworth (1917) during World War I, which asks the subject to 
check one hundred and sixteen items which were derived from 
psychiatrists' descriptions of neurotic or of prepsychotic patients. 
Mathews (1923) adapted this form for use with children by changing 
some of the situations and language. Thurstone (1930) published an
inventory of 223 items, some of which were from Woodworth and
other sources. McFarland and Seitz (1938) published a 92-item
inventory, half of which related to somatic symptoms and the other
half to mental situations or beliefs.

During World War II, two short general inventories had consider- 
able application: the Cornell Index and the Personal Inventory. 

The Cornell Index. This index (Weider, 1945) was issued in
1948 by the Psychological Corporation, revised as Form N2 for
civilian use. It is designed to be a rough screening device for
personal and psychosomatic disturbances. It consists of one hundred
short questions about behavior or symptoms, such as Do you frequently
feel faint? The person being tested is asked to answer all the
questions Yes or No, and if he is not sure to guess. The items are
designed to represent the types of symptoms shown in Illus. 210.

ILLUS. 210. CONTENT OF CORNELL INDEX

                                                                Question No.
Defects in adjustment expressed as feelings of fear
  and inadequacy                                                    2-19
Pathological mood reactions, especially depression                 20-26
Nervousness and anxiety                                            27-33
Neurocirculatory psychosomatic symptoms                            34-38
Pathological startle reactions                                     39-46
Other psychosomatic symptoms                                       47-61
Hypochondriasis and asthenia                                       62-68
Gastrointestinal psychosomatic symptoms                            69-79
Excessive sensitivity and suspiciousness                           80-85
Troublesome psychopathy                                            86-101

(By permission of Arthur Weider and The Psychological Corporation.)

The Index is administered to groups and without time limits.
College students usually finish in 5 minutes, and those who have not
finished grammar school take from 10 to 15 minutes. The reliability
is calculated by the Kuder-Richardson technique because the
distributions are usually very skewed. Among a thousand subjects tested
the reliability was .95.
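
A Kuder-Richardson coefficient of the kind mentioned above can be
sketched as follows. The text does not say which Kuder-Richardson
formula was used; the sketch assumes formula 20 for items scored 0 or
1, with the population variance of total scores, and the tiny response
matrix is hypothetical.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for a persons-by-items matrix of 0/1 scores.

    Assumes at least two items and some variation in the total scores.
    """
    n_persons = len(responses)
    n_items = len(responses[0])
    # Proportion of persons giving the scored answer to each item.
    p = [sum(row[j] for row in responses) / n_persons for j in range(n_items)]
    sum_pq = sum(pj * (1.0 - pj) for pj in p)
    total_variance = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1.0 - sum_pq / total_variance)


# Hypothetical answers of four persons to three Yes-No items (1 = Yes).
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(answers), 2))
```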

Norms are available only for male adults at this time, and they
are given in the form of cut-off score centiles (Illus. 211). This
shows, for instance, that if a cut-off score of seven had been used,
86 per cent of those rejected for military service would have been
detected, and 14 per cent not detected. At the same time 28 per cent
of the normals would have been tentatively classified as disturbed. In
military selection the Index was a practical tool, because a large
proportion of persons in trouble could thus be selected and sent for
psychiatric interviews immediately and only about a quarter, 28 per
cent, of the normals were so selected. The relatively small number of
disturbed persons not detected by the Index were later discovered
during the first few weeks of service.

ILLUS. 211. PER CENT OF REJECTS* IDENTIFIED AT VARIOUS
CUT-OFF LEVELS, CORNELL INDEX

                      Per Cent of               Per Cent of
Cut-Off Level   Psychiatric Rejects (400)     Normal Rejects
      0                 100%                      100%
      1                  99                        82
      2                  97                        67
      3                  94                        64
      4                  93                        46
      5                  92                        39
      6                  90                        32
      7                  86                        28
      8                  85                        24
      9                  83                        20
     10                  81                        18
     11                  78                        16
     12                  76                        15
     13                  74                        13
     14                  72                        12
     15                  68                        10
     16                  66                         9
     17                  62                         8
     18                  61                         7
     19                  60                         7
     20                  57                         6
     21                  55                         5
     22                  53                         4
     23                  50                         4
     24                  48                         4
     25                  45                         3
     26                  42                         3
     27                  41                         3
     28                  40                         2
     29                  39                         1
     30                  35                         1
     31                  34                         1
     32                  32                         1

* In terms of opinion at psychiatric interviews at five induction stations.

(By permission of Arthur Weider and The Psychological Corporation.)
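
The reading of a cut-off level described above can be sketched in a
few lines. Only a handful of rows from Illus. 211 are copied in; the
dictionary layout and the function name are assumptions made for this
illustration.

```python
# A few rows copied from Illus. 211:
# cut-off level -> (per cent of psychiatric rejects identified,
#                   per cent of normals identified)
ILLUS_211_ROWS = {
    0: (100, 100),
    5: (92, 39),
    7: (86, 28),
    10: (81, 18),
    13: (74, 13),
    20: (57, 6),
}


def screening_outcome(cutoff):
    """Summarize what a given cut-off level would have done, in per cent."""
    detected, normals_flagged = ILLUS_211_ROWS[cutoff]
    return {
        "detected": detected,            # rejects sent for interview
        "missed": 100 - detected,        # rejects not detected
        "normals_flagged": normals_flagged,
    }


print(screening_outcome(7))
```

Raising the cut-off lowers the share of normals flagged but also lowers
the share of rejects detected, which is the trade-off the table is
meant to show.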

"Stop" items refer to crucial symptoms, such as, "Have you ever
had a fit or convulsion?" or, "Were you ever a patient in a mental
hospital?" These items are scored separately when desired.

The Cornell Index was not found to be effective in indicating 
obsessive states, and does not screen hysterical palsies or prepsychotic 
and early psychotic states thoroughly. It is more effective in
indicating anxiety, hypochondriasis, asocial trends, convulsive
disorders, migraine, asthma, peptic ulcers, and borderline
psychosomatic disorders. The Index is not designed to analyze
difficulties but to give a composite score. Although responses can be
falsified, the authors believe that the follow-up results show that
gross falsification is uncommon.

The Personal Inventory. This inventory, by Shipley, Gray and 
Newbert (1946), is a questionnaire prepared originally for the United 
States Navy, but used by other branches of the military establish- 
ment for the purpose of screening large numbers of recruits or 
selectees. A long form contains 145 items, and a short form 20 items 
which were selected from the long form because of their capacity to 
distinguish between normal navy personnel and psychiatric
discharges. The items are cast into a forced-choice form, and one is
required to check the alternative which better describes himself.
One alternative is always more characteristic of the normal as
revealed by a large-scale case-history and analysis. The other
characterizes the psychiatrically undesirable. An attempt was made to
pair choices which are apparently equal in social desirability.

Upon analysis sixty items of the long form were found to have 
critical ratios of 2.7 or more between 1,004 normals and 84 psychiatric 
discharges. These sixty items were assigned a weight of one point each 
for purposes of scoring, and the other eighty-five items were retained
as a filler and for experimental purposes.
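
The selection of items by critical ratio can be sketched as below. The
text does not give the formula used; the sketch assumes the common
form, the difference between two proportions divided by the standard
error of that difference. The group sizes are those reported above,
but the proportions choosing the alternative are hypothetical.

```python
import math


def critical_ratio(p_a, n_a, p_b, n_b):
    """Difference between two proportions divided by its standard error."""
    se = math.sqrt(p_a * (1.0 - p_a) / n_a + p_b * (1.0 - p_b) / n_b)
    return (p_a - p_b) / se


# Hypothetical item: 30 per cent of the 84 psychiatric discharges and
# 10 per cent of the 1,004 normals check the same alternative.
cr = critical_ratio(0.30, 84, 0.10, 1004)
print(round(cr, 2), cr >= 2.7)   # an item was kept if the ratio was 2.7 or more
```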

The odd-even reliabilities of the short and long forms proved to be
almost identical: .66 for naval recruits and .91 among psychiatric
discharges. The validity as shown by critical ratios between normal
and psychiatric discharges was 18.5 for the long, and 20.9 for the
short form.

The correlations between the General Classification Test and the
two forms were both -.28 among normal recruits. The short form
therefore seemed to be as good a testing instrument as the long form.

Psychiatric Classifications 

The most carefully prepared and validated work in this field is
the Minnesota Multiphasic Personality Inventory (MMPI), Hathaway
and McKinley (1942, 1947). This inventory was designed to aid
in two basic aspects of clinical diagnosis: the quality of the disorder
and the intensity or the amount of disturbance. In order to develop
a scale for qualitative analysis, types of abnormalities were first
defined, then groups of persons who clearly represented these types
were secured, then the tests were applied to these groups and to
normal persons. Items which show differences between various groups
were then selected, and arranged in a battery of tests. The psychiatric
syndromes described by Kraepelin or his successors were used.
Kraepelin emphasized that these syndromes were complex patterns of
behavior which appeared fairly frequently but were apparently not
organic nor rigidly structured nor independent of each other.

Hathaway and McKinley, working in a large mental hospital,
secured samples of approximately eight hundred carefully studied
clinical cases. To these and to a group of normal adults they applied
a battery of 550 items on separate cards. Each person was asked to
respond by indicating whether or not the situation in the item was
typical of his own situation. The three choices were: true or mostly
true, not usually or entirely true, cannot say. The 550 questions were
classified into twenty-six groups, as shown in Illus. 212.

ILLUS. 212. CONTENT OF MINNESOTA MULTIPHASIC PERSONALITY
INVENTORY (MMPI)

1. General health (9 items)
2. General neurologic (19 items)
3. Cranial nerves (11 items)
4. Motility and coordination (6 items)
5. Sensibility (5 items)
6. Vasomotor, trophic, speech, secretory (10 items)
7. Cardio-respiratory system (5 items)
8. Gastro-intestinal system (11 items)
9. Genito-urinary system (5 items)
10. Habits (19 items)
11. Family and marital (26 items)
12. Occupational (18 items)
13. Educational (12 items)
14. Sexual attitudes (16 items)
15. Religious attitudes (19 items)
16. Political attitudes: law and order (46 items)
17. Social attitudes (72 items)
18. Affect, depressive (32 items)
19. Affect, manic (24 items)
20. Obsessive and compulsive states (15 items)
21. Delusions, hallucinations, illusions, ideas of reference (31 items)
22. Phobias (29 items)
23. Sadistic, masochistic trends (7 items)
24. Morale (33 items)
25. Items primarily related to masculinity-femininity (55 items)
26. Items to indicate whether the individual is trying to place himself
    in an acceptable light (15 items)

(By permission of Hathaway and McKinley and the University of Minnesota Press.)

From these items scoring scales were derived by selecting items 
which showed the largest differences between normal and other 
groups, and between the various clinical groups. From 40 to 60 items 
are scored for each scale. The same item is often found in more than
one scale. Nine scales of personality characteristics, as they are
called, are now available for scoring. These are described by Hathaway
and McKinley somewhat as follows.

a. Hypochondriasis. This scale includes worry about bodily
functions. Usually the patient has a long history of exaggeration of
physical complaints and of seeking sympathy.

b. Hysteria. This scale measures conversion-type symptoms, such
as paralyses, contractures, gastric or intestinal complaints, or cardiac
symptoms. They have attacks of weakness, fainting, or even
epileptiform convulsions. Hysterical cases are more immature
psychologically than any other group. Although their symptoms can often
be miraculously cured by a strong emotional experience, there is great
likelihood that other symptoms will appear if stress continues or
recurs.

c. Depression. This scale measures the depths of discouragement
or lack of self-confidence, which may be suicidal.

d. Hypomania. This scale measures overproductivity in thought
and action. The patient has usually got himself into trouble because
he has undertaken too many things. He is overenthusiastic and
overactive, and his activities may interfere with other people through
his attempts to reform social practice, or his stirring up of projects
in which he soon loses interest, or his disregard of social conventions.

e. Psychopathic deviate. This scale measures a group of persons
whose main difficulty lies in a usual absence of deep emotional
responses. Nothing really matters. They are commonly likeable and
intelligent, but they frequently digress by lying, stealing, alcohol and
drug addiction, and sexual immorality. They may have short periods
of disorientation and excitement or depression following a discovery
of their antisocial acts. They differ from some criminals in that they
seem to commit crimes with little thought of possible gain to
themselves or of avoiding discovery.

f. Paranoia. This scale shows persons characterized by
suspiciousness, oversensitivity, and delusions of persecution. Patients
with paranoid suspicions are common in many situations, and paranoiacs
usually appear normal when on guard. They are usually quick to take
vengeance against anyone who tries to control them. Persons with
high scores on this scale must be handled with special appreciation
of this possibility.

g. Psychasthenia. This scale shows persons with phobias or
compulsive behavior, expressed in hand-washing, vacillation, or other
ineffectual activities. The patient has queer thoughts or obsessive
ideas from which he cannot escape when awake or asleep, and which
serve him as a symbolic protection. Many persons, however, have
phobias, such as minor fears of snakes or spiders or locked doors,
without being greatly incapacitated. As long as they can avoid these
things they operate on a fairly even keel.

h. Schizophrenia. This scale measures responses which are
bizarre and unusual, caused by a splitting of the subjective life of
the person from reality. He reacts almost exclusively to his own
thoughts, wishes, and fears. Advanced cases seldom consciously respond
to the environment for long periods.

i. Masculinity-femininity. This scale contains items which were
selected to distinguish between the two sexes in the normal group.
Some items were inspired by the work of Terman and Miles (1936).

Finally, there are three scales the scores of which are not
indicative of clinical syndromes but show personality traits. One of
these, called the K score, is the number of answers which are omitted
because the client cannot say or will not choose. A large number of
omitted questions shows a tendency to withdraw or vacillate. Another
score, called the lie score, is composed of answers to fifteen items
which indicate rather gross exaggerations, such as "I always tell the
truth." Another score, called the validity score, is made up of
answers to sixty-four items which have seldom been answered in the
scored direction by normal persons. These scores show either highly
independent persons or those who are neurotic or psychotic. The
latter usually reveal themselves by high scores in other scales as well.

The present scales do not measure all qualities of personality, and
the authors promise that other scales will be developed as time goes
on, showing groups of primary or closely associated traits. Although
it is thought that the personality characteristics named above are
independent in the sense that they can occur in a person
independently from any other trait, yet in practice they are often
found together. In fact, it is seldom that a single characteristic is
found by itself.

Two administrative procedures are now available. In one
procedure each item is on a separate card and the client sorts the
cards into two piles, right and wrong. The other is called the group
procedure and in it the items are printed in a booklet. The subject
marks an answer sheet to show one of two choices, true or false. This
form may also be scored for only the first 367 items, thus reducing the
time needed for administration and scoring.

ILLUS. 213. PROFILE OF MINNESOTA MULTIPHASIC PERSONALITY
INVENTORY (MMPI)

[Figure: Profile and Case Summary Card]

The scores for all scales are changed to standard scores for a large
adult group where the mean is 50 and the standard deviation is 10.
On a profile chart (Illus. 213) the heavy lines show values at 30 and
70, representing two standard deviations below and above the mean.
The highest scores always represent the deviations toward
abnormality. Scores above 70 indicate either a borderline condition or
a need for careful examination of the clinical evidence. In some
instances scores below 70 are also indicative of serious trouble. In
looking at a profile, the two or three highest points must be
considered together. Slight differences between several high points may
be due to ineffective scaling. The authors reported test-retest
reliabilities ranging from .71 to .83 for the various scales.
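
The standard scores described above can be sketched as a simple
conversion. The sketch assumes a linear transformation of raw scores
(the text states only the resulting mean of 50 and standard deviation
of 10), and the raw-score mean and standard deviation used in the
example are hypothetical, not published norms.

```python
def standard_score(raw, norm_mean, norm_sd):
    """Linear standard score with mean 50 and standard deviation 10."""
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


def above_upper_line(score, line=70.0):
    """True when a standard score lies above the upper heavy line of the chart."""
    return score > line


# Hypothetical norms for one scale: raw mean 12, raw standard deviation 4.
for raw in (12, 20, 22):
    t = standard_score(raw, 12.0, 4.0)
    print(raw, t, above_upper_line(t))
```

A raw score two standard deviations above the norm mean falls exactly
on the line at 70; only scores beyond it are flagged by this sketch.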

The immediate diagnosis may not always agree with the highest 
point on the chart, for a person may show a violent symptom which 
is of less importance than another symptom of abnormal condition.
For instance, a client may show a great amount of depression with 
feelings of guilt, while at the same time he is progressively withdraw- 
ing from reality to a degree that may completely disorient him 
(schizophrenia). 

Psychological Classifications 

Other scales are based on psychological theories of personality, 
and are designed to include estimates of such traits as introversion- 
extroversion, ascendance-submission, sufficiency-dependence, and de- 
spondency-elation. Nearly thirty scales of this sort have been edited 
by various investigators. They are illustrated by the work of
Bernreuter (1931), Guilford-Zimmerman (1949), Adams (1945), Thorpe,
Clark, and Tiegs (1946), Kuder (1948), and others.

The Bernreuter Personality Inventory. This test consists of 125
items, similar to those in Illus. 149, describing both adjustments and
interests. Each item is to be answered with yes, no, or unable to
answer with yes or no. Four scores are obtained for each person from
keys which were prepared on the basis of results from four previous
tests: Thurstone's (1930) Personality Schedule of Neurotic
Tendencies, Laird's (1925) Inventory of Extroversion-Introversion,
Allport's (1928) Ascendance-Submission Scale, and Bernreuter's Test of
Self-Sufficiency. These four tests and the Personality Inventory were
administered to adults selected in part to represent extreme groups.
Each item in the inventory was correlated with total scores on each
of the four tests. The answers to each item were assigned points on
the basis of these correlations, the higher the correlation, the greater
number of points allotted. For instance, the answers to the item, "Do
you daydream frequently?" were given plus and minus values as
follows:




                                 TEST

Answer      Neurotic      Introversion   Dominance   Self-Sufficiency
Yes             5               3           -1              1
No         [illegible]     [illegible]       1             -1
Doubtful       -2               0            2             -2

Total scale scores were secured by adding the figures in each column. 
These totals were found to correlate highly with the corresponding 
previous tests. Thus, Bernreuter's score for neurotic tendencies cor- 
related .94 with Thurstone's schedule. Laird's and Bernreuter's in- 
troversion scores correlated .79, Allport's measure of ascendancy and 
Bernreuter's dominance correlated .81, and the two measures of 
self-sufficiency, .89.
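
The weighted scoring described above can be sketched as follows. The
"yes" and "doubtful" weights for the daydream item are those given in
the text; the second item and all of its weights are hypothetical and
stand in for the rest of the inventory. The function and key names are
invented for this illustration.

```python
SCALES = ("neurotic", "introversion", "dominance", "self_sufficiency")

# Points allotted to each answer, one value per scale.
WEIGHTS = {
    "daydream_frequently": {
        "yes": (5, 3, -1, 1),        # from the text
        "doubtful": (-2, 0, 2, -2),  # from the text
    },
    "hypothetical_item": {           # invented weights, illustration only
        "yes": (3, 2, -2, -1),
        "no": (-3, -2, 2, 1),
        "doubtful": (0, 0, 0, 0),
    },
}


def scale_totals(answers):
    """Add the points in each column over all the items answered."""
    totals = [0, 0, 0, 0]
    for item, answer in answers.items():
        for k, points in enumerate(WEIGHTS[item][answer]):
            totals[k] += points
    return dict(zip(SCALES, totals))


print(scale_totals({"daydream_frequently": "yes", "hypothetical_item": "no"}))
```

Each answer thus contributes to all four scale scores at once, so the
four totals are not independent of one another.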

The Bernreuter scores show high split-half reliability, median .90. 
Their intercorrelations are interesting. Neurotic tendencies corre- 
lated .96 with introversion. This shows either that the same persons 
have both sorts of adjustments, or that the two scales are measuring 
approximately the same patterns of behavior. Neurotic tendencies 
correlated with ascendancy .81, and with self-sufficiency .35. The self- 
sufficiency scores showed a low correlation with the others since self- 
sufficiency cuts across both ascendancy and introversion. 

These high intercorrelations led Flanagan (1935) to make a fac- 
torial analysis of Bernreuter’s scores from 305 eleventh-grade boys. 
He used Hotelling's method of principal components. Two factors 
appeared to account for the intercorrelation of the four Bernreuter 
scores. The first, a large factor in the test, is a combination of
neurotic, introversion, submission, and low self-sufficiency items.
Flanagan named this lack of self-confidence. The second factor, a much
smaller one, was called sociability. Flanagan constructed two new
scoring keys to aid in appraising these modes of behavior.

The stability of responses on the Bernreuter scale was studied by
Farnsworth (1938). Retests after 1, 2, and 3 years showed no
significant shifts in individual centile ranks, and there were high
retest correlations. For the average person 71 per cent of single items
were answered in identical fashion after an interval of 1 year, 65 per
cent after 2 years, and 65 per cent after 3 years.

The Guilford-Zimmerman Temperament Survey, 1949. The
work of many years of Guilford and his associates in the appraisal
of personality by elaborate inventories has resulted in a 300-item
questionnaire called the Guilford-Zimmerman Temperament Survey.
Thirty items are provided for each of ten traits, and no item
is used for appraising more than one trait. Seven of the traits are
the same as those described for the Guilford-Martin Inventories
(1940, 1945), and three of the traits are combinations of six previously
described traits. All of Guilford's work is characterized by great care
in the preparation and the analysis of items. The traits have been
defined by several factorial analyses on various populations, and the
wording of items has been studied to increase their uniqueness for
one trait. The traits are described as follows:

G General activity: hurrying, liking for speed, liveliness, vitality,
  production, efficiency, and their opposites.
R Restraint: serious, deliberate, persistent versus carefree,
  impulsive, excitement-loving.
A Ascendance: self-defense, leadership, bluffing, speaking in public
  versus submissiveness and hesitation.
S Sociability: many friends, seeking friends and social activities,
  seeking limelight versus few friends and shyness.
E Emotional stability: evenness of moods, optimistic, composure versus
  fluctuation of moods, pessimism, day-dreaming, excitability, feelings
  of guilt, worry, loneliness, and ill health.
O Objectivity: thick skinned, accurate observing versus hypersensitive,
  self-centered, suspicious, having ideas of reference.
F Friendliness: tact, acceptance of domination, respect for others
  versus hostility, resentment, desire to dominate, and contempt for
  others.
T Thoughtfulness: reflective, observing of self and others, mental
  poise versus interest in overt activity and mental disconcertedness.
P Personal relations: tolerance of people, faith in social institutions
  versus fault-finding, uncooperative, suspicious, self-pitying.
M Masculinity: interest in masculine activities, not easily disgusted,
  hard-boiled, inhibits emotional expression, little interest in
  clothes and styles versus easily disgusted, fearful, romantic,
  emotionally expressive, and dislike of vermin.

The items are all in the form of statements, usually affirmative, and
often using the second person pronoun, for example:

You like to play practical jokes on others.

Most people are out to get more than they give.

The affirmative form is preferred because it is usually a little simpler
than a question, and according to Guilford and Zimmerman, may
allay resistance and increase the number of projective answers.

The answers are all to be placed on an IBM answer sheet by
marking yes, ?, or no for each item. The use of these three categories
was determined by polling the attitudes of several hundred students
toward different kinds of responses. About 60 per cent stated that they
could not do without the question mark very well, and the
preferences were about equal for 3, 4, and 5 choices.

In scoring the test one point is allowed for each item answered in
the direction of the trait. The question mark and the other possible
response are not counted. The average proportion of persons scoring
on each item was about .60, and the range from .10 to .90. The means
of total scores for any trait center around 18, and the standard
deviations are a little more than 5 points. The reliability coefficients
for the various traits ranged from .75 to .87, and the standard error of
an obtained score was approximately 2.5. The intercorrelations of trait
scores for 266 college men show that traits S and A correlated .61;
traits O and E correlated .69; all the rest were considerably lower,
showing a desirable uniqueness.
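
The standard error of about 2.5 quoted above agrees with the usual
formula for the standard error of measurement, SD times the square root
of (1 minus the reliability). The short sketch below is only an
illustration: it assumes a standard deviation of exactly 5 (the text says
"a little more than 5"), and the function name is ours, not the test
authors'.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Return SD * sqrt(1 - r), the standard error of an obtained score."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Assumed SD of 5 points; reliabilities are the two extremes given in the text.
for r in (0.75, 0.87):
    print(r, round(standard_error_of_measurement(5.0, r), 2))
```

With the lower reliability the result is 2.5 points; with the higher
reliability it falls below 2 points.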

Separate norms are furnished for 523 college men and 389 college
women. There are small differences between the mean scores of men
and women. The men had slightly higher scores for traits R, A, E,
and O, and much higher scores for trait M. The women had higher
means for traits S, F, and P, and the means were the same for traits G
and T. In all except trait M the overlapping of scores between the
two groups was large. Profile charts giving separate male and female
norms are provided (Illus. 214). Scores can be read from this in
centiles, C scores, or T scores. No age differences have yet been found,
for an application of the form to high school students and their
parents yielded similar distributions for the two groups.
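
A T score of the kind read from the profile chart can be approximated
from a raw score when the norm mean and standard deviation are known.
The sketch below assumes the conventional linear T scale (mean 50,
standard deviation 10) and uses a mean of 18 and a standard deviation of
5 as round figures suggested by the text. The published chart may use
normalized rather than linear scores, and the function name is
illustrative only.

```python
def linear_t_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Convert a raw score to a linear T score (mean 50, SD 10)."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd  # standard (z) score
    return 50.0 + 10.0 * z


# Hypothetical norms: mean 18, SD 5 (round figures, not the published norms).
for raw in (13, 18, 23):
    print(raw, linear_t_score(raw, norm_mean=18.0, norm_sd=5.0))
```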

The interpretation of scores for both industrial and clinical use is
indicated by Guilford and Zimmerman. Thus they have accumulated
some evidence that supervisory and administrative personnel should
have C scores between 5 and 9 for all except trait P, where the most
favorable C scores are from 6 to 10. The least favorable scores are
usually from 0 to 3 or 4. The authors point out that, while in general
a high degree of a trait is good, clinically it must be considered along
with other qualities. Thus a high general activity score is good if
combined with reflectiveness, and bad if combined with emotional
instability or contempt for others. The authors have not yet furnished
any indicators of faking, or the dependability of the score, but they
suggest that the number of question marks used as answers should
not be more than three for any trait, because the standard deviation
for a trait is approximately 6. Furthermore any person whose scores
are all very high or very low should be interviewed to ascertain the
behavior pattern involved.

Attitude-Interest Analysis Test. Terman and Miles (1936) reported
a 7-year investigation designed to give a more factual basis to
concepts of masculinity and femininity. They devised two equivalent
paper-and-pencil forms with contents as follows:


Exercise 1, Word Association: 60 items. (.62)*
Exercise 2, Ink Blot Association: 18 items; a rough silhouette is
followed by four words. (.34)
Exercise 3, Information: 70 items. (.68)
Exercise 4, Emotional and Ethical Responses: 105 items. (.90)
Exercise 5, Interests: 119 items. (.80)
Exercise 6, Personalities and Opinions: 41 items. (.64)
Exercise 7, Introvert Responses: 42 items. (.32)

* The figures in parentheses are average split-half reliabilities for both sexes
combined, for ten narrow groups of about one hundred persons each.

Each exercise was prepared by trying out many more items on
groups of from one hundred to two hundred of each sex in the eighth
grade, high school, and college, then retaining only those items which
showed reliable sex differences. A scoring key, which assigned
weights from +15 to −15, was devised for each response to each item.
It was based on the degree to which the response distinguished
between groups of different sexes. Later trials on new groups showed
conclusively that unweighted scores showed as much reliability and
as much difference between the sexes as the weighted scores.

Each exercise was given a different weight depending upon its own 
reliability, its discrimination between sexes, its independence from 
the other variables, and its standard deviation. The split-half reliabil- 
ities of subtests ranged from about .34 for Exercises 2 and 7, to .90 for 
Exercise 4. Thus Exercise 4, Emotional and Ethical Responses, is 
reliable enough to locate a person with reasonable accuracy by the 
use of one form of the test. If both forms are used, all except Exercises
2 and 7 are reliable enough to compare small populations. Profiles
of individuals are therefore not recommended from these tests. 

The split-half reliabilities of total scores on small populations of 
one sex only averaged .78, and for both sexes together .92, when only 
one form was used. When both forms were combined the reliabilities 
rose to .88 for one sex and .96 for both sexes. The standard error of a 
score on one form was approximately 15 points. The total scores on
one form range from −200 to +200. The general-population average
for males was +52 and for females −70.
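
The rise in reliability when both forms are combined is about what the
Spearman-Brown formula predicts for a test of doubled length. The
source does not say that this formula was used, so the sketch below is
offered only as a check on the reported figures; the function name is
illustrative.

```python
def spearman_brown(reliability: float, length_factor: float = 2.0) -> float:
    """Predicted reliability when a test is lengthened by length_factor."""
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )


# One-form split-half reliabilities reported in the text.
for label, r in (("one sex", 0.78), ("both sexes", 0.92)):
    print(label, round(spearman_brown(r), 2))
```

The predicted values, .88 and .96, match the two-form reliabilities
reported above.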

Terman reported that college sophomores easily faked masculine 
or feminine scores to an extreme extent, when they knew the pur- 
pose of the test and were asked to see how much they could change 
their scores. The test is called an Attitude-Interest Analysis, however, 
and only a few naive subjects suspect its purpose. 

Terman and Miles (1936) furnished a number of interesting
comparisons of total scores of racial, age, and sex-delinquent groups, all
of which show wide average differences between male and female
groups, but some variations. Thus the average differences between
sexes are smaller for persons over seventy years of age, for English
private school children, for college-of-music students, and for
Japanese adolescents in Hawaii, than for somewhat random high school
or college groups. Male average scores rose from about 43 for
fourteen-year-olds to 72 for sixteen-year-olds, and then decreased to 67
for twenty-year-old college sophomores, to 58 for adults from twenty
to thirty years, to 39 for forty to fifty years, and to 10 or less for those
over sixty years. Female average scores rose from −95 at fourteen
years to −60 at twenty years, and then decreased slowly with age to
−89 for those sixty or older.

Physical measures showed no marked relationship with M-F scores
of either men or women, but more research is needed. Slight
relationships were found between masculinity and height of males, and
between femininity and length of trunk in relation to height in
females. The relation of M-F scores to occupations was studied by
comparing the means of small groups of persons in different
occupations. Journalists, clergymen, and artists averaged about 16; police
and firemen 28; farmers and building trades 33; clerks and
merchants 42; mechanics, teachers, physicians, surgeons, dentists 46;
lawyers, salesmen, bankers 58; engineers and architects 81; and college
athletes 93. Among women those in domestic occupations had the
lowest average scores, about −100; stenographers, dressmakers, and
hairdressers averaged −90; musicians and artists −80; clerks and
business women −78; teachers −70; nurses −63; and physical-education
teachers −36. Among both men and women the relationship
between masculinity or lack of femininity and an education-intelligence
factor was significant. The more educated showed higher scores. The
differences were more marked among men than among women.

The relation of M-F scores with alleged interests, as shown by a
self-rating on twelve fields of activity, was studied for 212 male adults
and 533 female adults with high school educations. The men with
“very much interest” in science, mechanics, sports, and travel had
average M-F scores of from 51 to 59, while those with “very much
interest” in art averaged 16, in domestic arts 28, in religion 31, and
in music 38. The M-F score of those with high interest in literature,
politics, pets, and social life ranged from 42 to 46. Among the women
similar trends appeared. Those with “very much interest” in
mechanics, sports, politics, and pets had average M-F scores of from
−77 to −66, while the scores of those with high interest in
religion, art, domestic art, music, and social life ranged from −93 to
−86.

Finally, a detailed analysis and classification of items showed males
to be more interested in adventure, outdoor strenuous occupations,
machinery and tools, science, and business, while females were more
interested in domestic, artistic, humanitarian, and social affairs.
Emotionally the males manifested greater self-assertion and aggressiveness,
fearlessness, and roughness of manners and language, while the
females showed more timidity, sympathy, fastidiousness, and weakness
in emotional control. Neither sex showed any superiority over the
other in unselfishness or moral principles or reasoning. No evidence
was presented as to the relation between sex differences and innate
factors. It was pointed out that additional studies where either
environment or inheritance are closely controlled are necessary to throw
light on the relation of sex differences and culture.

Sheldon's Scale for Temperament. One theory of the dynamics
of personality is based on the idea that physical or physiological
characteristics largely determine dynamic patterns of behavior.

Sheldon (1942) developed a Temperament Scale (Illus. 215) using
a preliminary list of 50 traits and a 5-point scale with a group of
thirty-three male college graduates. He found three clusters of traits
which correlated highly with one another and much less with traits in
the other two clusters. Several similar experiments carried on during
a period of 4 years resulted in three clusters of twenty traits each.
The first group of traits, which he called viscerotonia, typify a person
who is overrelaxed, gluttonous, oversocialized, too dependent upon
people, and overcomplacent, and who looks backward toward
childhood. The second group, called somatotonia, includes characteristics
of those who are extremely aggressive, energetic, dominating, fond
of risk, combative, ruthless, loud, active, and adjusted to the present.
The third group, called cerebrotonia, includes traits of the person
who is unusually tense, restrained, sensitive, secretive, inhibited,
intent, and emotionally involved, and looks toward the future.
Sheldon believed that ideally the scale should be made up of sets of 3-way
traits, such as is shown in Illus. 215, Item 1, where relaxation has two
opposites, one in assertiveness, and the other in restraint or tightness.

The three temperament types are determined by rating each trait
on a 7-point scale thus:

 4%  1. Extreme antithesis is shown to the trait.
15%  2. Trait is weak, although there are traces.
29%  3. Trait is present, but falls a little below average.
29%  4. About half-way between extremes.
15%  5. Trait is strong, but not outstanding.
 6%  6. Trait is very strong and conspicuous.
 2%  7. Extreme manifestation.


ILLUS. 215. SHELDON'S SCALE FOR TEMPERAMENT

Name          Date          Photo No.          Scored by

I. VISCEROTONIA

( )*  1. Relaxation in posture and movement
( )   2. Love of physical comfort
( )   3. Slow reaction
      4. Love of eating
      5. Socialization of eating
      6. Pleasure in digestion
( )   7. Love of polite ceremony
( )   8. Sociophilia
      9. Indiscriminate amiability
     10. Greed for affection and approval
     11. Orientation to people
( )  12. Evenness of emotional flow
( )  13. Tolerance
( )  14. Complacency
     15. Deep sleep
( )  16. The untempered characteristic
( )  17. Smooth, easy communication of feeling, extraversion of
         viscerotonia
     18. Relaxation and sociophilia under alcohol
     19. Need of people when troubled
     20. Orientation toward childhood and family relationships

II. SOMATOTONIA

( )*  1. Assertiveness of posture and movement
( )   2. Love of physical adventure
( )   3. The energetic characteristic
( )   4. Need and enjoyment of exercise
      5. Love of dominating, lust for power
( )   6. Love of risk and chance
( )   7. Bold directness of manner
( )   8. Physical courage for combat
( )   9. Competitive aggressiveness
     10. Psychological callousness
     11. Claustrophobia
     12. Ruthlessness, freedom from squeamishness
( )  13. The unrestrained voice
     14. Spartan indifference to pain
     15. General noisiness
( )  16. Overmaturity of appearance
     17. Horizontal mental cleavage, extraversion of somatotonia
     18. Assertiveness and aggression under alcohol
     19. Need of action when troubled
     20. Orientation toward goals and activities of youth

III. CEREBROTONIA

( )*  1. Restraint in posture and movement, tightness
      2. Physiological overresponse
( )   3. Overly fast reactions
( )   4. Love of privacy
( )   5. Mental overintensity, hyperattentionality, apprehensiveness
( )   6. Secretiveness of feeling, emotional restraint
( )   7. Self-conscious motility of the eyes and face
( )   8. Sociophobia
( )   9. Inhibited social address
     10. Resistance to habit, and poor routinizing
     11. Agoraphobia
     12. Unpredictability of attitude
( )  13. Vocal restraint, and general restraint of noise
     14. Hypersensitivity to pain
     15. Poor sleep habits, chronic fatigue
( )  16. Youthful intentness of manner and appearance
     17. Vertical mental cleavage, introversion
     18. Resistance to alcohol, and to other depressant drugs
     19. Need of solitude when troubled
     20. Orientation toward the later periods of life

* The thirty traits with parentheses before them constitute collectively the short
form of the scale.

(By permission of W. H. Sheldon and Harper & Bros.)

There are more cases at the lower extreme because the antithesis of
a trait may take two forms, while there can be only one extreme
manifestation. Rating 1 is therefore about twice as common as rating 7.
A mean for each type is secured and written as IT, Index of
Temperament; thus an IT 243 indicates a mean rating of 2 for viscerotonia, 4
for somatotonia, and 3 for cerebrotonia. For 200 thoroughly studied
cases the split-half reliabilities for these ratings were approximately
.90. The correlations between temperament types and body types
were found to be approximately .80, and in rare cases where
temperament varied by 2 or more points from the corresponding
morphological predominance, difficult adjustments or maladjustments were the
rule.
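
The Index of Temperament described above can be sketched in a few
lines. The example assumes that each of the three means is rounded to
the nearest whole number (halves rounded up) before the digits are
written together; the text does not state the rounding rule, and the
function name and sample ratings are hypothetical.

```python
from typing import Sequence


def index_of_temperament(
    viscerotonia: Sequence[int],
    somatotonia: Sequence[int],
    cerebrotonia: Sequence[int],
) -> str:
    """Return a three-digit index built from the mean rating of each component."""

    def rounded_mean(ratings: Sequence[int]) -> int:
        if not ratings:
            raise ValueError("each component needs at least one rating")
        if any(not 1 <= r <= 7 for r in ratings):
            raise ValueError("ratings must lie on the 7-point scale")
        mean = sum(ratings) / len(ratings)
        return int(mean + 0.5)  # assumed rule: round halves upward

    components = (viscerotonia, somatotonia, cerebrotonia)
    return "".join(str(rounded_mean(c)) for c in components)


# Hypothetical ratings on twenty traits per component.
viscero = [2] * 20
somato = [4] * 10 + [3] * 5 + [5] * 5
cerebro = [3] * 20
print(index_of_temperament(viscero, somato, cerebro))
```

For these invented ratings the three means are 2, 4, and 3, so the
index is written 243, as in the example in the text.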

Sheldon also gives evidence that poor emotional adjustment and
unsatisfactory achievement are related to differences or conflict
between ideals or habits and body types. For instance, a man with a
263 morphological index would have a bad time trying to live the life
of a theoretical scientist, but would adjust well as an athlete or
playground director.

Sheldon has analyzed two hundred male adults and grouped them 
according to degree of good adaptation: 




14%  Group 1. Superior adaptation
64%  Group 2. Well adapted:
       2a Naturally adapted
       2b Overcame difficulties
17%  Group 3. Socially unadaptable:
       3a Overendowed, ectomorphic
       3b Overcompensated, mesomorphic
       3c Reversals of dominance
       3d Sex-environment clash
 5%  Group 4. Constitutional inferiors:
       4a Underendowed

He found a wide variety of types in each of the first two groups,
but Group 1 contained 26 per cent of those who were predominantly
cerebrotonics, and only 1 per cent of viscerotonics, 6 per cent of
somatotonics, and 8 per cent of balanced types. In general, Group 2
contained most of the dominant viscerotonics (75 per cent), and
smaller proportions of the other types. Group 3 had practically no
viscerotonics, but Group 4 had 14 per cent of them.

Sheldon has defined a number of other indices for total dysplasia,
gynandromorphy, gynandrophrenia, textural components, health,
central strength, physical intelligence (how effectively one uses his
muscles), aesthetic intelligence (sensitive appreciation of one's
environment), and manifest sexuality, which are to be rated
independently of the morphological and temperamental traits. The
gynandromorphy index is interesting because two persons can be nearly the
same in total bodily pattern but differ considerably in sexual
components.

Child and Sheldon (1941) made a study among Harvard
undergraduates of correlations between somatotypes and ability and
personality test scores. None of the correlations were significant, but
the results cannot be considered to be conclusive, because of narrow
sampling and vague analysis of the traits measured. Sheldon points
out that an enormous amount of careful research is needed to
determine the significance of physique and related physiological reactions
in personality patterns, and to secure more adequate norms for
various age groups, for women, and for racial groups.

Cattell's Personal Characteristics. One of the most thorough
attempts to include all possible personality characteristics in an
analysis is that of R. B. Cattell (1947) who collected more than 1,800 trait
names from psychiatric, psychological, and literary sources and
reduced them to 171 names that seem important and somewhat
independent. Ratings were obtained on one hundred adults for 171
traits, and the intercorrelations of all 171 traits were analyzed. Traits
which correlated at least .45 with one another were grouped
together. This resulted in thirty-five clusters of traits (Illus. 216). Each
of these clusters was phrased as a single trait and was used in rating
another group of 208 male adults. From a factorial analysis of these
ratings, eleven factors emerge which Cattell considers basic or
primary.
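
The grouping rule described above, in which traits correlating at
least .45 with one another are placed together, can be illustrated by a
small sketch. This is only one reading of the rule: it assumes that a
trait joins a cluster when it reaches the threshold with every trait
already in that cluster, and it takes traits in the order given, so a
different order could yield different groups. The source does not give
Cattell's exact procedure, and the trait names and correlations below
are invented for illustration.

```python
from typing import List, Sequence


def cluster_by_threshold(
    names: Sequence[str],
    corr: Sequence[Sequence[float]],
    threshold: float = 0.45,
) -> List[List[str]]:
    """Greedily group traits whose mutual correlations all reach threshold."""
    clusters: List[List[int]] = []
    for i in range(len(names)):
        for cluster in clusters:
            # Join the first cluster in which every member correlates enough.
            if all(corr[i][j] >= threshold for j in cluster):
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no suitable cluster: start a new one
    return [[names[i] for i in cluster] for cluster in clusters]


# Invented correlation matrix for four hypothetical trait names.
traits = ["talkative", "sociable", "cheerful", "orderly"]
matrix = [
    [1.00, 0.60, 0.50, 0.10],
    [0.60, 1.00, 0.48, 0.20],
    [0.50, 0.48, 1.00, 0.05],
    [0.10, 0.20, 0.05, 1.00],
]
print(cluster_by_threshold(traits, matrix))
```

In this invented example the first three traits form one cluster and
the fourth stands alone.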

ILLUS. 216. CATTELL'S THIRTY-FIVE CLUSTERS OF TRAITS

DEFINITIONS

1. Readiness to cooperate vs. Obstructiveness
   Readiness to cooperate: Generally tends to say yes when invited to
   cooperate. Outgoing. Ready to meet people at least halfway. Finds
   way of cooperating despite difficulties.
   Obstructiveness: Inclined to raise objections to a project, cynical
   or realistic. “Cannot be done.” Uninterested or unfavorable attitude
   to joining in. Inclined to be “difficult.”

2. Emotionally stable vs. Changeable
   Emotionally stable: Can be depended upon to look at questions
   objectively, without emotional prejudice, and in the same constant
   light from day to day. Above emotion in his judgments. Dependable
   and realistic.
   Changeable: Sees things in terms of the emotion of the moment.
   Emotional bias changes from day to day and place to place. Does not
   remain the same person from day to day. Undependable.

3. Attention-getting vs. Self-sufficient
   Attention-getting: Shows off in company. Not happy unless in center
   of the stage. Talks about self, accomplishments, important friends,
   etc. Likely to show some “affected” behavior.
   Self-sufficient: Not under compulsion to impress or to get sympathy
   or attention.

4. Assertive, self-assured vs. Submissive
   Assertive, self-assured: Assumes he can impose his (or her) will on
   others. Tends to lead or influence his associates. Tends to
   dominate. Tends to be boastful and assertive. Not held back by
   doubts. Invulnerable self-esteem.
   Submissive: Tends to let other people have their way. Tends to back
   down in a conflict. Humble, quiet, retiring. Not sure he is right.
   “Embarrassable.”

5. Depressed, solemn vs. Cheerful
   Depressed, solemn: Earnest and solemn most of the time. Not easily
   moved to laughter. Seeming slow and depressed rather frequently.
   Cheerful: Generally bubbling over with good cheer. Optimistic.
   Enthusiastic. Prone to cheerful, witty remarks. “Laughterful.”

 6. Frivolous vs. Responsible
 7. Attentive to people vs. Cool, aloof
 8. Easily upset vs. Unshakable poise, tough
 9. Languid, slow vs. Energetic, alert
10. Boorish vs. Intellectual, cultured
11. Suspicious vs. Trustful
12. Good-natured, easygoing vs. Spiteful, grasping, critical
13. Calm, phlegmatic vs. Emotional
14. Hypochondriacal vs. Not so
15. Mild, self-effacing vs. Self-willed, egotistic
16. Silent, introspective vs. Talkative
17. Persevering, determined vs. Quitting, fickle
18. Cautious, retiring, timid vs. Adventurous, bold
19. Hard, stern vs. Kindly, soft-hearted
20. Insistently orderly vs. Relaxed, indolent
21. Polished vs. Clumsy, awkward
22. Prone to jealousy vs. Not prone to jealousy
23. Rigid vs. Adaptable
24. Demanding, impatient vs. Emotionally mature
25. Unconventional, eccentric vs. Conventional
26. Placid vs. Worrying, anxious
27. Conscientious vs. Somewhat unscrupulous
28. Composed vs. Shy, bashful
29. Sensitively imaginative vs. Practical, logical
30. Neurotic fatigue vs. Absence of neurotic fatigue
31. Esthetically fastidious vs. Lacking artistic feeling
32. Marked interest in opposite sex vs. Slight interest in opposite sex
33. Frank, expressive vs. Secretive, reserved
34. Gregarious, sociable vs. Self-contained
35. Dependent, immature vs. Independent-minded

(By permission of R. B. Cattell, 1947, and the editor of Psychometrika.)



The reliability of these ratings was determined by correlating the 
mean of a group of eight raters with the mean of a second group of 
eight raters. Correlations ranged from .51 to .60 for a group of men, 
which is about 20 points lower than similar correlations for a group 
of college women. This finding is typical of the care with which col- 
lege women rated one another and the lack of care by the men. The 
thirty-five traits were then correlated with one another and the matrix 
was analyzed by factorial analysis, using the centroid method. The 
first five factors below appear to be clearly determined and the last six
have smaller loadings. The letters assigned to these factors were
derived from a previous study. The present order indicates the size of
factorial loadings. The factors are named (Cattell, 1947, p. 211) as
follows:

E Dominance versus submissiveness: The traits which contribute to this
factor are variables 4, 33, 35, 19, 26, and 5. (Illus. 198)

G Positive integration versus immaturity: Strong, silent, hard, thoughtful,
stable versus weak, slipshod, quitting and changeable, social person.
The traits contributing most highly to this factor are 7, 2, 6, 24, and 17.

H Charitable, adventurous, cyclothymia versus withdrawn: The traits
contributing to this factor are 34, 32, 27, 28, and 6. This is supposed to
be the constitutional factor lacking in schizophrenic tendency. It brings
in good character qualities, sex interest, and conscientiousness.

F Surgency versus desurgency: Its representative traits are 9, 5, 13, 18.
The person is energetic, cheerful, talkative, with some show of emotion.

A Cyclothymia versus schizothymia: This is represented by traits 1, 24, 12,
15, 7, 26, and 23. These show a factor which it is not easy to distinguish
from Factor H, the main difference being that Factor H emphasizes
withdrawal, shyness, bashfulness, and cautiousness, while Factor A
stresses obstruction, spitefulness, worrying, anxiety, and a rigid
behavior. This is probably a marked contribution to our knowledge of
personality structure, for two independent traits appear here as in
previous studies by Cattell and others.

K Trained, socialized, cultured mind versus boorishness: 10, 28, 21, 29,
8, 27.

B Intelligence versus mental defect: 20, 27, 17, 10.

I Sensitive, imaginative, emotionality versus rigid, mature, poise: 24, 35,
31, 6, 3, 36, 29.

J Thoughtful, neurasthenic versus vigorous, simple character: 1, 13, 4,
8, 10, 30.

M Spiessburger concernedness versus Bohemian intellectualism: 25, 29,
26, 23, 10.

L Paranoic schizothymia versus sensitive, trustful acceptability: This
factor again brings in schizophrenia but with an emphasis this time on
suspicion and jealousy: 22, 26, 36, 28, 11.

A previous study was made of men averaging thirty-five years of 
age, and it was thought to be a more significant one for adults than 
the study reported here upon college students. The same factors
appeared, however, in nearly the same order. Cattell has shown that
this evidence has been supported from many sources, and he be- 
lieves that these eleven traits are basic personality factors which 
remain fairly constant in different populations. 

In order to provide an instrument to measure the personality fac- 
tors isolated by his careful research, Cattell and his associates have 
published the Sixteen Factor Personality Questionnaire (1950). It is 
unique in that each item has a known saturation with all factors, but
it is scored only for the one factor which it best represents. By
including the factors listed above the authors believe that no
important aspect of personality has been omitted. Each factor is
appraised by from 20 to 26 items when Forms A and B are used. Each
form can usually be completed in 30 minutes or less.

There are three types of items. One type asks an opinion about
oneself to be answered with yes, in between, or no. An example of the
questions is, “Are you well described as a happy, nonchalant
person?” Another type of item is designed to measure intellect through
knowledge of word relations, such as, “Which word does not belong
with the other two: North, East, Down?” A third type of item asks one
to choose between two occupations, activities, or values. For example,
“Would you rather work as an engineer or a social science teacher?”
For most of the items there is a score of 0, 1, or 2, to be recorded
according to a key. The raw score totals for each factor are changed
to standard scores and placed on a profile. In general a high score
indicates a strong development of the positive aspect of the factor, and
a low score the negative aspect of the factor. This schema is similar
to that used by Guilford and Martin (1945), but differs from that of
Sheldon (1942) who proposed factors having three poles.

Cattell and Luborsky (1947), stimulated by Freud's hypothesis that
wit is an expression of needs which are repressed in everyday life,
experimented with the possibility of measuring personality
characteristics by one's reaction to various kinds of jokes. One of the
results is the C-L Humor Test (1949). Form A of this test consists of
ninety-one pairs of jokes. One of each pair is to be marked as “more
amusing” than the other, not “more witty, or tasteful, or
intellectual.” An example is:

a) Shall I clip the ends of your hair, sir?
   No thanks. One end will be sufficient.

b) Chatty assistant: Shall I go over it again, sir?
   No. I heard every word you said.

Form B of the C-L Humor Test consists of 112 jokes. The person
being tested is to indicate opposite each joke whether he thinks it is
funny or dull. Both forms were selected from a much larger sample
of jokes as the result of three successive researches which brought to
light eleven correlation clusters. Each cluster is now represented in
each form by from seven to fourteen items. The raw scores, which are
the number of items preferred in each cluster, are changed to
standard scores of adults and placed on a profile. The interpretation of this
profile should be made in the light of a number of other variables,
but in general it is believed that a high score indicates both a strong
drive and a high degree of inhibition against the drive. The eleven
clusters are described as follows:

1. Debonair sexuality versus guilt, inhibition. These are correlated with
sociable, happy-go-lucky versus shy, anxious reactions to jokes.

2. Derision versus pathos and calm acceptance. These jokes on the one
hand deride stupidity, laziness, gullibility and innocence, and on the
other show a wry acceptance of human fate.

3. Self-composure versus nervous insecurity. These jokes show enjoyment
of shocking events or some reassurance in the face of a nervous doubt.

4. Disregard of conventions versus light badinage. Positive jokes
indicate pleasure in violating conventions.

5. Negativism versus secure robust enjoyment. These jokes ridicule
persons who customarily receive some deference, such as the parson or the
father, or at the opposite pole there is robust enjoyment without
spitefulness.

6. Resigned masculinity versus pleasure in active discomfiture. The
positive jokes involve a blunt aggression against men and the negative tilt
at the foibles of women.

7. Ironic dominance versus masochism. The positive jokes play without
spite on weaknesses of people, while the negative stress self-punishment
and also attacks on others.

8. Good-natured play versus smart wit. The positive are slapstick jokes
and the negative indicate sophisticated criticism with a slight tone of
disgust.

9. Wanton aggressiveness versus whimsy. The positive jokes show
unprovoked aggression which brings surprise and discomfort to
well-meaning people. The negative show cheerful acceptance of the blows
of fortune.

10. Sociable good humor versus dry comment. The positive jokes have a
hale-fellow-well-met mood, while the negative show dry or even bitter
aggression.

11. Cynicism versus intellectual play. The positive jokes are critical of a
wide range of moral values: deceit pays, culture is hypocrisy, etc. The
negative tend to be a play on words.

An inspection of these clusters seems to indicate considerable
overlapping among them, but Cattell points out that with these, as in
other personality appraisals, valid tests cannot be based on
inspection, but rather on a goodly number of empirical checks. This type
of test has the advantage of being interesting for most adults, hard to
fake, and even in its present form fairly reliable.

Other Scales. a. The Kuder Preference Record, Personal (1948)
is a questionnaire which is the result of several years of research. It
involves approximately forty-two thousand correlations between
various scales and items in experimental samples. In his search for
independent variables, Kuder found five and these are represented in
this questionnaire. They are called preferences for:

Sociable activities: taking the lead and being the center of a group.

Practical activities: dealing with external needs and getting things done.

Theoretical activities: thinking, speculating.

Agreeable activities: those which make for smooth, pleasant personal
relations.

Dominant activities: use of authority and power.

The questionnaire consists of 168 items, each of which lists three
activities. One is asked to indicate the most liked and the least liked
of the three. He is also asked to choose as if he were equally familiar
with all the activities, and to put down his first reaction. There are
no time limits. In addition to the five trait scores, a verification score
is computed to identify those who have answered carelessly or not
followed directions. It is also pointed out that the obtained scores
may not indicate usual behavior, but rather, wishful thinking.
Indeed the directions in this and other preference questionnaires seem
to encourage such answers. Norms and prediction studies are now
being developed.

b. The Mental Health Analysis of Thorpe, Clark, and Tiegs
(1946) comes in four levels: elementary (grades 4 to 8), intermediate
(grades 7 to 10), secondary (grades 9 to college), and adult. At each
level the same traits are evaluated by two hundred items. Some of
the items are repeated in the various levels. Each item is a short
statement to be answered by yes or no. The authors have tried to disguise
questions which might conflict with one's tendency to protect
himself. Thus, instead of asking, “Are you immature?” the question is,
“Are you quick enough to get the best seats at a program?” And
instead of, “Do you offend people?” they ask, “Have you found that
many people's feelings are easily hurt?” The Lewerenz Vocabulary
Grade Placement Formula was followed to keep the language
difficulties considerably below the grade levels of those to be tested.

The traits, which are each evaluated by twenty items, are divided 
into two main groups, liabilities and assets, but since each trait is 
considered a continuum and all high scores are assets, there is a 
total over-all score which gives a general index of mental health.
The traits are called:

I-A. Behavioral Immaturity. The behaviorally immature individual
reacts on the basis of childhood (infantile) ideas and desires. He has not
learned to assume responsibility for, or to accept the consequences of, his
own acts. He attempts to solve his problems by such childish methods as
sulking, crying, pouting, hitting others, or pretending to be ill. He has
failed to develop emotional control and thinks primarily in terms of
himself and his own comfort.

I-B. Emotional Instability. The individual who is emotionally unstable
is characteristically sensitive, tense, and given to excessive self-concern. He
may substitute the joys of a fantasy world for actual successes in real life.
He may develop one or more physical symptoms designed to provide him
with an escape from responsibilities and thus to diminish his distress. He is
quick to make excuses for failure and to take advantage of those who will
serve him.

I-C. Feelings of Inadequacy. The inadequate individual feels inferior
and incompetent. This feeling may be related not only to particular skills
or abilities but may be general in nature. Such a person feels that he is
not well regarded by others, that people have little faith in his future
possibilities, and that he is unsuccessful socially. He feels that he is left out of
things because he is unattractive and because he lacks ability.

I-D. Physical Defects. The individual who possesses one or more
physical defects is likely to respond with feelings of inferiority because of
unfavorable comparisons or of handicaps in competition with other persons.
It is usually not the physical defect per se that brings unhappiness but the
restrictions and social disapprovals which come in its wake. Thus the
extremely short, the homely, or the crippled individual may feel that his
handicap is insurmountable.

I-E. Nervous Manifestations. The individual who is suffering from
nervous symptoms manifests one or more of a variety of what appear to be
physical disorders such as eye strain, loss of appetite, inability to sleep,
chronic weariness, or dizzy spells. Persons of this kind may be exhibiting
physical (functional) expressions of emotional conflicts. Stuttering, tics, and
other spasmodic or restless movements are also symptomatic of this type of
mental ill-health.

II-A. Close Personal Relationships. The individual who possesses this
asset to mental health counts among his acquaintances some in whom he 
can confide, who show genuine respect for him as a person, and who welcome 
close friendship of a warm and substantial nature. Such an individual en- 
joys a sense of security and well-being because of having status with those 
who mean something to his welfare. 

II-B. Inter-Personal Skills. The socially skillful individual gets along 
well with other people. He understands their motives and is solicitous of 
their welfare. He goes out of his way to be of assistance to both friends and 
strangers and is tactful in his dealings with them. The socially skillful person
subordinates his egoistic tendencies in favor of the needs and activities of 
his associates. 

II-C. Social Participation. The socially adjusted individual participates
in a number of group activities in which cooperation and mutuality are
in evidence. In contrast to the isolate who prefers his own company, the
mentally healthy individual enjoys the companionship of others. His
willingness to contribute to the success of group endeavors provides him with
the feeling of belongingness and of having status which his nature requires.

II-D. Satisfying Work and Recreation. The well-adjusted individual
experiences success and satisfaction in his work, whether it be the seeking of
an education or occupational relationships in the world of professions,
industry, or business. He also participates in a variety of hobbies and
recreational activities which provide release from tension. He will have chosen
tasks that challenge him and that satisfy his need for approval and a sense
of achievement.

II-E. Outlook and Goals. The mentally healthy individual has a
satisfying philosophy of life that guides his behavior in harmony with socially
acceptable, ethical, and moral principles. He also understands his environ- 
ment and the forces and cause and effect relationships which shape his 
destiny as a member of a social group. He establishes approved personal 
goals and makes reasonable progress toward their attainment. 

The Kuder-Richardson reliabilities based on 1,225 cases were .967
for total score, .935 for liabilities, and .931 for assets. The desirable
response for liability items is no, and for the asset items is yes. Each
item is marked with a tiny letter to show the trait which it evaluates.
Scores are the totals of desirable responses for each trait. The authors
recommend that the test be used in industry for selecting employees,
up-grading employees, increasing employee efficiency, and improving
employee-management relations. Clinical uses are also listed, and
the usual causes of disturbances and methods of treatment are
outlined. The centile norms for each level of maturity group are given
on one chart, for it was found that sex and age differences were
insignificant within the groups.
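
The text does not say which Kuder-Richardson formula the authors used. As a minimal sketch, assuming formula 20 and items scored 1 for the desirable response and 0 otherwise, the coefficient could be computed as follows. The function name and the sample data are hypothetical.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomous items.

    responses: one list per person, each holding 0/1 item scores
    (1 = desirable response). This layout is an assumption.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    # Variance of the total scores (population form, to match p * q).
    total_var = pvariance([sum(person) for person in responses])
    # Sum over items of p * q, where p is the proportion scoring 1.
    pq_sum = 0.0
    for i in range(n_items):
        p = sum(person[i] for person in responses) / n_people
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


# Hypothetical answers of four persons to three items.
sample = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(kr20(sample), 3))
```

With a real record the same call would be made once for the total score and once each for the liability and asset items.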

c. The California Test of Personality, Tiegs, Clark, and Thorpe 
(1934, 1943) has five different levels, ranging from the first grade to
adulthood. On each level 144 items are divided equally into twelve 
sections. Each section is designed to evaluate a component of per- 
sonality. The first six components are related to self-adjustment in 
that they show how a person feels about himself. The items are
called: self-reliance, sense of personal worth, sense of personal free- 
dom, feeling of belonging, withdrawing tendencies, and nervous 
symptoms. The second group is related to social adjustment, and 
these items indicate how one feels toward others or gets along with 
others. They are called: social standing, social skill, freedom from 
anti-social tendencies, family relations, school relations, and com- 
munity relations. Percentile ranks are furnished for each level, both 
sexes together. The authors have found small differences, but have 
thought that these were not large enough to require separate norms 
for boys and girls. A number of studies have been made surveying 
mental-health problems among school children and adults. In
general, the maladjustment corresponds to poor parental and home
relations, speech defects, and neurotic traits. One study compared the
scores of 303 ninth-grade pupils with the multiple-choice Ink Blot
scores. It showed that those who seemed to be maladjusted according 
to the Rorschach Test made a higher number of undesirable responses 
on the California Test of Personality than those who were apparently 
adjusted according to the Rorschach Test. Bi-serial correlations of 
approximately .30 were found between the Rorschach scores and the 
total social-adjustment and self-adjustment scores. Another study 
reported the applications on students from grade four to eight of 
the California Test of Personality and a sociometric study using 
three questions: “Who is your best friend?” “Who do you like to
work with?” and “Who among your companions would you like to 
be like?” This study showed that superior adjustment, as indicated 
on the personality test, is not widely recognized by children as a
criterion for acceptance. Acceptance or aspiration was more closely 
associated with above average accomplishment and aggressiveness. 
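
The bi-serial correlations mentioned above relate a two-way classification (here, adjusted or maladjusted by the ink-blot scores) to a continuous score. The following is only a sketch of the usual textbook bi-serial formula, not a reconstruction of the study's computations. The function name and the four sample scores are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def biserial(scores, in_upper_group):
    """Bi-serial r between a continuous score and a forced two-way split.

    scores: continuous test scores, one per person.
    in_upper_group: True/False flags marking the two categories.
    """
    upper = [s for s, flag in zip(scores, in_upper_group) if flag]
    lower = [s for s, flag in zip(scores, in_upper_group) if not flag]
    p = len(upper) / len(scores)
    q = 1 - p
    # Ordinate of the unit normal curve at the point cutting off area q.
    y = NormalDist().pdf(NormalDist().inv_cdf(q))
    return (mean(upper) - mean(lower)) / pstdev(scores) * (p * q / y)


# Hypothetical scores; the last two persons fall in the upper category.
print(round(biserial([1, 3, 2, 4], [False, False, True, True]), 3))
```

The formula assumes that a normally distributed trait underlies the two-way classification, which is the usual justification for preferring it to the point bi-serial coefficient.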

d. Forer (1948) published a Diagnostic Interest Blank designed to 
give data on psychodynamics, such as frustrated overt needs,
psychological defenses, basic needs of which the individual may not be
aware. The blank consists of lists of 88 hobbies and sports, 70 per- 
sonal characteristics, 22 reading interests, 74 occupational interests, 
and 25 secret hopes and ambitions. The items are all single words or
short phrases, such as “play baseball, optimistic, tall, insists on his 
rights, mechanic magazines, teach others, and become invisible at 
will.” The directions ask the testee to circle yes, U, or no to indicate 
whether the item is characteristic, unimportant, or not characteristic 
of his ideal person or the person he would like to be. Forer believed 
that the use of the ideal person leads to freer expression of needs and 
impulses than the use of self. Six scoring categories are suggested:

1. Social orientation: personal or interpersonal emphasis on cooperation,
competition, autism, narcissism.

2. Group identification: conformity, moral values.

3. Major role: dominant or dependent, diffuse or rigid, occupational
orientation and consistency.

4. Means of achieving goals: acceptance of responsibility, attitude toward
discipline.

5. Realism: practical interest, fantasy, occultism, over-extension versus
reasonable selection.

6. Sexual adjustment: acceptance of opposite sex, conditions of
acceptance, compensatory factors.

No quantitative scales have as yet been set for this blank, but 
qualitative interpretations of results are described. The results from 
this blank must be interpreted in part by comparing them with a 
case history showing usual social, recreational, and occupational
behavior. The discrepancies between the actual and the ideal will show
areas of frustration, which may be further explored for content and 
basic conflicts by a clinician. 

Sociological Classifications 

Several inventories have been issued which indicate the adequacy 
of adjustments in what may be called sociological areas. Widely used 
examples of these are the Minnesota Scale for the Survey of Opinions
(1936), the Bell Adjustment Inventory (1934, 1938), and the Mooney
Problem Check List (1946).

The Minnesota Scale for the Survey of Opinions. In constructing
scales to be used for measuring the effects of the depression on
personality and family life of young people, Rundquist and Sletto
(1936) drew most of the preliminary items from their own logical
considerations. Care was exercised to secure informal language and
clarity of statement. The personal pronoun was avoided since they
believed that impersonal statements of majority opinion might be
more frank. Positive and negative statements were made in equal
numbers, the positive usually expressing optimistic or socially
acceptable views. The items were submitted to groups of students for
criticism. Statements containing the words all, always, none, and never
were considered to be unsatisfactory, for students sometimes
disagreed with this type of statement although agreeing with the main
conclusion of the item.

The statistical selection of items included two considerations:
reliability and discriminative ability. The reliability of each item
was secured by giving the scales twice within a week. Items which
showed wide fluctuations were discarded. The discrimination of an
item was measured by noting differences in answers to it by the
highest and lowest quarter of a group. The items finally selected usually
showed the largest differences. More negative than positive statements
were found to be discriminative.
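
The comparison of highest and lowest quarters described above can be sketched as follows. The text does not say how the group was ranked; ranking by total score is an assumption here, and the function name and sample records are hypothetical.

```python
def quarter_discrimination(records, item):
    """Difference in mean item score between the top and bottom quarters.

    records: one list of item scores per person.
    item: zero-based index of the item being examined.
    Persons are ranked by total score (an assumption; the source
    does not state the ranking criterion).
    """
    ranked = sorted(records, key=sum)
    quarter = max(1, len(ranked) // 4)
    low, high = ranked[:quarter], ranked[-quarter:]

    def item_mean(group):
        return sum(person[item] for person in group) / len(group)

    return item_mean(high) - item_mean(low)


# Hypothetical answers of eight persons to two items on a 1 to 5 scale.
people = [[5, 3], [5, 3], [4, 3], [4, 3], [2, 3], [2, 3], [1, 3], [1, 3]]
print(quarter_discrimination(people, 0))  # first item separates the quarters
print(quarter_discrimination(people, 1))  # second item does not
```

Items with the largest differences would be retained, and an item answered alike by both quarters would be discarded.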

The final inventory of 132 items contains scales for measuring six
aspects of adjustment:

1. Morale: feelings of inability to cope with one’s problems

2. Social inferiority: feelings of inability to succeed in association with
others

3. Family: ideas about pleasantness and intimacy of family life

4. Law: attitudes toward legal institutions

5. Economics: conservatism and radicalism

6. Education: belief about the values of education

ILLUS. 217. MINNESOTA SCALE FOR THE SURVEY OF OPINIONS

Directions: READ EACH ITEM CAREFULLY AND UNDERLINE QUICKLY THE
PHRASE WHICH BEST EXPRESSES YOUR FEELING ABOUT THE STATEMENT.
Wherever possible, let your own personal experience determine your answer. Do not
spend much time on any item. If in doubt, underline the phrase which seems most nearly
to express your present feeling about the statement. WORK RAPIDLY. Be sure to
answer every item.

1. THE FUTURE IS TOO UNCERTAIN FOR A PERSON TO PLAN ON MARRYING.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

2. AFTER BEING CAUGHT IN A MISTAKE, IT IS HARD TO DO GOOD WORK
FOR A WHILE.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

3. HOME IS THE MOST PLEASANT PLACE IN THE WORLD.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

4. THE LAW PROTECTS PROPERTY RIGHTS AT THE EXPENSE OF HUMAN
RIGHTS.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

5. THE GOVERNMENT SHOULD TAKE OVER ALL LARGE INDUSTRIES.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

6. A MAN CAN LEARN MORE BY WORKING FOUR YEARS THAN BY GOING
TO HIGH SCHOOL.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

7. IT IS DIFFICULT TO THINK CLEARLY THESE DAYS.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

8. IT IS EASY TO EXPRESS ONE’S IDEAS.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

9. PARENTS EXPECT TOO MUCH FROM THEIR CHILDREN.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

10. A PERSON SHOULD OBEY ONLY THOSE LAWS WHICH SEEM REASONABLE.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

11. LABOR SHOULD HAVE MUCH MORE VOICE IN DECIDING GOVERNMENT
POLICIES.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

12. THE MORE EDUCATION A MAN HAS THE BETTER HE IS ABLE TO
ENJOY LIFE.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

13. THE FUTURE LOOKS VERY BLACK.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

14. IT IS DIFFICULT TO SAY THE RIGHT THING AT THE RIGHT TIME.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

15. ONE OUGHT TO DISCUSS IMPORTANT PLANS WITH MEMBERS OF HIS
FAMILY.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

16. IT IS ALL RIGHT TO EVADE THE LAW IF YOU DO NOT ACTUALLY
VIOLATE IT.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

17. LEGISLATURES ARE TOO READY TO PASS LAWS TO CURB BUSINESS
FREEDOM.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

18. EDUCATION HELPS A PERSON TO USE HIS LEISURE TIME TO BETTER
ADVANTAGE.
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

(Rundquist and Sletto, 1936. By permission of the University of Minnesota
Press.)

Each scale consists of 22 items, and each item is to be marked on a 1
to 5 scale, using the words strongly agree, agree, undecided, disagree,
and strongly disagree. Illustration 217 shows the first eighteen items of
which 1, 7, and 13 are scored for morale, 2, 8, and 14 for social
inferiority, etc.
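
The scoring arrangement just described, in which every sixth item belongs to the same scale, can be sketched as below. Two points are assumptions: that the cycle continues through all 132 items, and that strongly agree counts 1 and strongly disagree 5. The source gives no key for which statements are scored in the reverse direction, so `reversed_items` is a hypothetical parameter.

```python
SCALES = ["morale", "social inferiority", "family",
          "law", "economics", "education"]

# Assumed numeric values; the direction of scoring is not given in the text.
PHRASES = {"strongly agree": 1, "agree": 2, "undecided": 3,
           "disagree": 4, "strongly disagree": 5}


def score_inventory(answers, reversed_items=frozenset()):
    """Total the underlined phrases into six scale scores.

    answers: dict mapping item number (1-based) to the underlined phrase.
    reversed_items: item numbers scored in the opposite direction
    (hypothetical; the actual key is not reproduced here).
    """
    totals = {name: 0 for name in SCALES}
    for number, phrase in answers.items():
        value = PHRASES[phrase.lower()]
        if number in reversed_items:
            value = 6 - value
        # Items 1, 7, 13, ... fall on the first scale, 2, 8, 14, ... on the second.
        totals[SCALES[(number - 1) % 6]] += value
    return totals


answers = {1: "Strongly agree", 7: "Disagree", 13: "Undecided", 2: "Agree"}
print(score_inventory(answers))
```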

Norms are available for 1,000 young persons: 400 from college,
200 from regular high schools, and 400 from continuation high school
classes. No significant differences between high school and college
students were found. Individual profiles can be made to show one’s
position in this sample of persons. The scores on the separate scales had
reliability coefficients of approximately .8. The intercorrelations
between these six scales for a sample of five hundred young adults
were given in a table.

[Table of intercorrelations among the six scales not recoverable.]

Morale scores were found to be most highly correlated with the other
scores, and economic conservatism the least highly correlated.
In a small group of high school students correlations between these
adjustment scores and mental ability were low, with two exceptions.
Morale correlated with honor points .506 for boys but not for girls,
and attitude toward education correlated -.10 with IQ for girls but
not for boys. The correlations of IQ's and honor points were .17 for
boys and .56 for girls. The authors’ original work should be consulted
for a clearer analysis of their results from applying this scale to many
parent-child, age, and social groups.

The Bell Adjustment Inventory (1934, 1938). This inventory
consists of 160 items divided equally among five areas of adjustments:
home, health, other persons, emotional disturbances, and
occupations (Illus. 156). The items were selected from a preliminary set on
the basis of their discrimination between the upper and lower 15
per cent of individuals when ranked for total adjustment scores. Each
item is to be checked yes, no, or ?. Separate scores are available for
each field as well as the total. The odd-even correlations ranged from
.81 to .91 for the separate sections, and to .94 for the total scores.
The intercorrelations of the separate sections ranged from -.06 to
.51, median .21, which is an indication of a marked degree of
independence among these self-ratings. Two forms of this inventory are
now available, one for adults, the other for students in high school
and college. This is a widely used inventory, the value of which has
been reported in more than fifty technical articles.
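
An odd-even correlation of the kind reported for the Bell inventory can be sketched as follows. The text does not say whether the reported figures were stepped up by the Spearman-Brown formula, so the sketch returns both the half-test correlation and the stepped-up value. The function names and sample data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(responses):
    """Return (half-test r, Spearman-Brown estimate for the full test).

    responses: one list of item scores per person, in item order.
    """
    odd = [sum(person[0::2]) for person in responses]   # items 1, 3, 5, ...
    even = [sum(person[1::2]) for person in responses]  # items 2, 4, 6, ...
    r_half = pearson(odd, even)
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full


# Hypothetical yes/no answers (1 = yes) of four persons to four items.
data = [[0, 0, 0, 0], [0, 1, 0, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
r_half, r_full = odd_even_reliability(data)
print(round(r_half, 3), round(r_full, 3))
```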

Mooney Problem Check Lists. Three check lists for the study of
student problems have been devised by Mooney (1946), one for
college, one for high school, and one for junior high school. In each list
the student is asked to “Read the list slowly and as you come to a
problem which is troubling you, draw a line under it. For example,
if you are often bothered by headaches, you would draw a line under
the first item, like this: ‘1. Often have headaches.’ ” On the last page
the student is asked to answer such questions as:

1. Which of the problems you have marked are troubling you most?
Write about two or three of these if you care to.

2. Have you enjoyed using this check list of problems?

3. Would you like to spend more time in school trying to do something
about some of your problems?

4. Would you like to talk to someone about your problems?

The college and high school check lists each have 330 short items 
which are classified into eleven areas. (Sample items are given in 
parentheses.) 

1. Health and physical development. (Often get sick) 

2. Finances, living conditions, and employment. (Too crowded at home) 

3. Social and recreational activities. (Slow in getting acquainted with
people)

4. Courtship, sex, and marriage. (Boy friend) (Too few dates)

5. Social-psychological relations. (Unpopular) (Being snubbed)

6. Personal psychological relations. (Too easily discouraged)

7. Morals and religion. (Drinking) (Dislike church service)

8. Home and family. (Family quarrels) (Want to leave home)

9. The future: vocational and educational. (Need to decide upon a
vocation)

10. Adjustment to school work. (Getting low grades)

11. Curriculum and teaching procedures. (Tests unfair)

The junior high school list includes 210 items grouped in seven 
areas, which are similar to numbers 1, 4, 6, 7, 8, 9, and 10 above. The 
average number of problems marked by about 1,000 college students 
was 30, by 1,025 high school students 27, and by 684 junior high 
school students 23. The range was from 16 to more than 100 prob- 
lems. Over 90 per cent of college and high school students indicated 
that they enjoyed filling out the check list. More than 60 per cent 
in college and 70 per cent in high school requested a chance to talk 
over their problems with someone. The manuals for administering
these check lists give samples of individual counseling uses and
research applications. The relations of problems to age, sex, and locality
have been tentatively found. More research is in progress. 

COMPARISON OF SCALES, NEEDS FOR RESEARCH 
Correlational Analyses 

In view of the fact that nearly all of the appraisals of modes of
behavior thus far discussed secure total scores by adding together
items which are somewhat ambiguous, factorial analysis of scores
cannot be expected to give clear results. Such analyses, however, are
of value in showing whether the items which have been grouped
together on the basis of logical or empirical considerations are found
to depend upon one or upon several factors. The statistical approach
may, if carefully interpreted, furnish an answer to the question,
What are the main independent modes of behavior in a particular
group of persons? The methods of Thurstone and of Spearman are
most frequently used. Since they employ somewhat different
procedures, both will be illustrated.

Results from Thurstone's Method. Vernon (1938) applied
Thurstone’s technique to the results of the elaborate Boyd Personality
Questionnaire given to fifty men and fifty women student
teachers. The 175 correlations among nineteen separate scores averaged
.366 m .00 when corrected for attenuation. Only four independent
factors were needed to account for the correlation matrix. Vernon
identified them tentatively as:

1. Self-depreciation. This was prominent in items which emphasize
depression, instability, anxiety, shrinking from responsibility, and lack of
self-sufficiency. This was the largest factor. It accounted for 41 per cent of the
total variance. The next 3 factors together accounted for only 35 per cent.

2. Carefreeness. This was prominent in items representing suggestibility,
freedom from worries, dissociation, inability to concentrate, lack of definite
interests, and freedom from tenseness.

3. Scrupulousness. This is found in obsessional carefulness, strong
self-control of feelings, freedom from emotional thinking, strong concentration,
and acting readily without pressure.

4. A factor which differentiated men from women. Women showed
stronger dislikes and fears, and more instability and dependency. Men
showed more scrupulousness, inability to concentrate, and
introspectiveness.
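
Two quantities in the account of Vernon's study lend themselves to a short illustration: the correction of a correlation for attenuation, and the share of total variance credited to one factor. The sketch below uses the standard formulas, assumes standardized scores, and uses made-up figures; it does not reproduce Vernon's data.

```python
from math import sqrt


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation between two measures freed of unreliability.

    r_xy: observed correlation; r_xx, r_yy: reliabilities of the two measures.
    """
    return r_xy / sqrt(r_xx * r_yy)


def variance_share(loadings):
    """Proportion of total variance accounted for by one factor.

    loadings: the factor's loading on each variable (unit-variance scores).
    """
    return sum(a * a for a in loadings) / len(loadings)


# Hypothetical values for illustration only.
print(round(correct_for_attenuation(0.30, 0.81, 0.64), 3))
print(round(variance_share([0.8, 0.6, 0.0, 0.0]), 2))
```

A statement such as "41 per cent of the total variance" corresponds to a `variance_share` of .41 for that factor's loadings.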

A careful study by Layman (1937) obtained twelve factors from
correlations between sixty-seven items which had been answered by
276 students. The twelve factors were tentatively named:

Gregariousness
Social inadequacy
Social initiative
Social aggressiveness
Changeability of interests
Self-sufficiency, or independence
Inferiority, or lack of confidence
Impulsiveness
Moodiness
Sensitivity
Emotional introversion
Inability to face reality

The first four factors in this list are modes of social adjustment. 
They indicate four independent modes of behavior, not a single 
elemental sociability. Although these four modes are not easy to 
distinguish, they seem to be reasonable subdivisions of social activity. 
Thus, a person may be gregarious, that is, desire the company of 
others, because he feels socially inadequate, because he is aggressive, 
or because he desires to initiate some cooperative venture. There may 
be other independent aspects of social contact which will appear in 
other investigations. The last eight factors listed by Layman seem to 
be emotional adjustments. It is difficult to secure an accurate verbal 
description of emotional modes of behavior from this material, but 
the factor names are fairly descriptive and they confirm the work of 
others. 

Carter, Conrad, and Jones (1935) used Thurstone's method to 
analyze an inventory of children’s annoyances and their relationship 
with a measure of intelligence. They found three independent factors 
called (1) general annoyability, (2) annoyance at untidiness, and (3) 
personal annoyance from interest in self-esteem. Intelligence showed 
a negative relationship with the tendency to be annoyed. 

A factorial analysis by Burt (1938) is of unusual interest because he 
arrived at essentially the same results, using correlations among traits 
and among persons. His sample included 124 persons chosen from 
a larger group because they all had similar average ratings on eleven 
emotional tendencies. Analyses following the methods of Spearman,
Thurstone, and Kelley produced factors of general emotionality, of 
aggressive-inhibitive emotions, and of pleasurable-unpleasurable emo- 
tions. 

Thorndike (1936) applied Thurstone’s (1935) method to social- 
intelligence and mental-alertness tests. He found that the social-intel- 
ligence tests measure primarily the ability to understand and to work 
with words, which is such a large factor in verbal-intelligence tests. 

A factorial analysis was made by Brogden (1940) of intelligence 
and character tests applied to one hundred sixth grade children. He 
used the Otis Group Intelligence Test and thirty performance tests 
which included 4 designed to measure honesty; 3, perseveration; 3, 
persistence; 2, slang usage, and 1 to indicate each of the following: 
inhibition, suggestibility, conscientiousness, deportment, and grades
in school subjects. By using Thurstone's (1935) method seven factors
were delineated:

1. Resistance to suggestion, similar to Spearman’s W

2. Honesty, as shown by unwillingness to cheat

3. Persistence, as shown by continuing work in spite of fatigue, boredom,
or distraction

4. Verbal facility, as shown by the Otis Test, similar to Thurstone’s V

5. A factor probably related to reasoning, not clearly identified

6. Self-control, or dutifulness

7. Acceptance of a moral code

Spearman's Method. During Spearman's research on cognitive
experience, he found evidences of traits which corresponded to
variations in energy. These were called Perseveration, Oscillation, and
Will.

1) Perseveration (P) and Its Opposite, Fluency (F). The idea that
persons differ in their ability to shift rapidly from one activity to
another has been investigated by many since Wiersma (1906) reported
tests of speed of sensory adaptation. A survey of elaborate studies by
Lankes (1911) and Cattell (1934) shows that five kinds of tests have
been used:

Persistence of sensory after-effect, as shown by

1. Speed at which different colored sectors of a color-disk fuse, called
flicker limen

2. Time needed for light and dark visual adaptation

3. Time needed for recovery of hearing after a loud noise

4. Time needed for recovery of touch after a severe electric shock

Spontaneous recurrence of an experience

1. In free association the tendency to give the same reaction to the
same or to different stimulus words

2. Direct questions about tunes, poetry, phrases, problems, and dreams
coming to mind again and again

Hindrance of new mental activity by similar past activity

1. Comparison of writing S’s continuously with writing them as they
appear in a mirror. This reversal technique is also applied to various
letters, numbers, and drawings

2. Comparison of immediate recall of a drawing with recall which has
been delayed by the exposure of a second drawing. This technique
was also applied to short narratives

3. Direct questions on effects of being interrupted in various tasks,
homesickness, seasickness, desire for change, and tendency to finish
a task, although a reason no longer exists for completing it

Usual rates of activity

1. Natural rate of tapping. The subject was told to tap with his finger
just as he feels inclined at the time

2. Speed of free association

Emotional patterns

1. Aroused with difficulty but long in duration

2. Pessimism and lethargy

3. Few likes

4. Either unusually submissive or else negativistic

5. Inability to make small decisions quickly

6. Either untruthful or punctiliously truthful

7. Gives up easily

The results of applications of such tests, usually to small numbers 
of students, have been summarized by Spearman (1938). From cor- 
relation analyses he concludes that there exists in various amounts in 
each person a tendency for mental processes to lag. It is measured 
by inertia or slowness in shifting energy from one arrangement to
another.

Line and Griffin (1935) applied a factorial analysis to tests of word 
association, reaction time, oscillation, perseveration, and Bernreuter 
scores, in an attempt to find the factors underlying mental health. A 
major factor emerged, which was called objectivity, and thought to 
be related to Spearman's fluency (F), which is the reverse of persevera- 
tion or inertia (P). 

A factorial research by Kleemeier and Dudek (1950) investigated 
the possible existence of an independent factor which they defined 
as flexibility. A battery of thirteen tests was composed and applied 
to 205 college students. The battery contained nine tests where no 
flexibility was required, and four where some flexibility was
required. The no-flexibility tests required one to add single or 2-digit
numbers in one test, or in another test to subtract similar numbers.
The flexibility test required one to shift from addition to
subtraction, and vice versa in random order. Other no-flexibility tests
required the addition of a final letter to make a word and the addition
of an initial letter to make a word. In a flexibility test these require- 
ments were varied in a random fashion without any instructions to 
indicate whether the answer was an initial or a final letter. Still other 
no-flexibility tests consisted of printed rows of M's among which 
were a few N’s. One was asked to count the N’s. A similar test required 
one to count all the W’s in rows of M's. The flexibility test here was 
composed of rows of M's some of which contained N's and some W's. 
All of these tests were prepared in two equivalent forms and ad- 
ministered in a rotated order to avoid practice effects. The centroid 
analysis did not reveal any common elements in the flexibility tests 
although four well-defined factors appeared corresponding to
perceptual speed, verbal ability, single-digit, and double-digit
computation, which accounted very well for the variances of the scores on all
the tests. This study points to the conclusion that an independent 
factor of flexibility is not necessary to explain the results of these 
tests. It may be that in much more complex situations a flexibility 
factor will appear. 
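
The centroid analysis mentioned above begins by extracting a first factor from the column sums of the correlation matrix. The sketch below shows only that first step, on a made-up three-test matrix with communality estimates assumed on the diagonal. The reflection of signs and the repeated extraction needed for later factors are omitted, and the function name is hypothetical.

```python
from math import sqrt


def first_centroid_factor(r):
    """First centroid loadings and the residual matrix.

    r: square correlation matrix as a list of rows, with communality
    estimates on the diagonal (how these are estimated is left open).
    """
    size = len(r)
    col_sums = [sum(row[j] for row in r) for j in range(size)]
    root_total = sqrt(sum(col_sums))
    # Each loading is the column sum divided by the root of the grand sum.
    loadings = [s / root_total for s in col_sums]
    # Remove the part of each correlation explained by the first factor.
    residual = [[r[i][j] - loadings[i] * loadings[j] for j in range(size)]
                for i in range(size)]
    return loadings, residual


# Hypothetical matrix generated by a single factor with loadings .8, .6, .4.
matrix = [[0.64, 0.48, 0.32],
          [0.48, 0.36, 0.24],
          [0.32, 0.24, 0.16]]
loadings, residual = first_centroid_factor(matrix)
print([round(a, 2) for a in loadings])
```

When the residuals are all near zero, as in this example, one factor is enough. In the flexibility study the point of interest was that no residual factor common to the flexibility tests remained after four factors had been taken out.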

2) Oscillation (O). Another energy adjustment which Spearman 
finds unique is fluctuation or oscillation. It is supposed to be shown 
by: 

a. The duration of attention shown in the waxing and waning of faint 
sensory stimuli, such as a light or small weight 

b. The fluctuations in stereoscopic vision between the patterns which
will not fuse, and also in monocular vision, the fluctuations in
reversible perspective drawings

c. Changes in rate of continuous work in aiming or cancellation test, or
crossing out circles

Measures of this factor are not related to perseveration. Although 
both measure rapidity of change of some central mechanism, oscilla- 
tion is dependent upon recuperation from fatigue, while persevera- 
tion is not due to fatigue but to difficulties in changing direction of 
energy. By analogy, oscillation would correspond to variations of the 
steam pressure in a power plant, perseveration to the time required 
for the steam to be turned off for one machine and turned on for an- 
other; and g, or intellectual power, to the average steam pressure 
maintained in the plant.

3) Will (W). On the basis of quantitative results, Aveling (1926)
concluded that conation in the sense of striving for a goal was quite
different from volition in the sense of resolving or selecting a goal.
If we accept the former as a rough definition of will, it is possible to 
assemble three types of appraisals which have been used by many 
investigators: 

1. Tests involving variations in effort, shown by tests of
a. Quality of handwriting, Courtis (1925)

b. General mental tests

c. Rate tests

d. Tests of persistence in the face of distractions (see Hartshorne
and May, 1930)

2. Ratings of behavior, such as those by Webb (1915) and Bern- 
reuter (1931) 

3. Observations of effort and persistence, such as time-sampling 
and log records 

Results from variations of incentives indicate that the amount of 
effort which is effective usually increases with the complexity of the
task or the speed with which the task must be performed. Great effort,
however, such as produced by a much desired prize, tends to
increase speed at the expense of accuracy, even when aimed at producing
more accuracy. Effort when effective in unspeeded tests of
complex sorts seems to direct one toward relevant processes and
continued action. In group measures variations in effort seldom produce
any significant changes in intercorrelations.

Maller (1934) reported an analysis of correlations of measures of
four aspects of character, honesty, cooperation, inhibition, and per- 
sistence. To measure and estimate such behavior he applied an 
elaborate schedule of tests used by Hartshorne and May. A total of 
708 pupils in three schools which served persons in upper, middle, 
and lower economic levels, were tested. The correlations between the
scores taken to represent the four phases of character were low but
positive, mean .29. The tetrad differences resulting from the analysis
of separate tables of correlations for each school group were extremely
small and in no instance more than three times their respective
PE's. According to Spearman's logic, all the correlations may,
therefore, be attributable to the presence of one common factor. 
When two measures of mental ability were introduced into the cor- 
relation matrices, the tetrad differences were large, hence the factor 
common to the behavior tests cannot be identified with g. Maller
believed that the common factor was a readiness to forego an im- 
mediate goal for the sake of a remote but more valuable goal. Be- 
havior of this kind was demanded in all of the tests, and was typical 
of the will (W) factor described by Spearman. Tests of honesty and 
cooperation doubtless involve in addition to a factor of will, other 
independent factors determined by moral ideals. In some cases it may 
take more effort to be dishonest than honest. The intercorrelations 
among tests of honesty were usually low. 
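
The tetrad criterion used in this analysis can be illustrated with a short sketch in Python. It is only an illustration under stated assumptions: the four loadings are hypothetical values chosen so that the mean correlation falls near .29, they are not the correlations of the study described above, and the function name is likewise an assumption. When every correlation is the product of two loadings on a single common factor, each of the three tetrad differences for four measures vanishes.

```python
from itertools import combinations


def tetrad_differences(r, a, b, c, d):
    """Return the three tetrad differences for the four measures a, b, c, d."""
    return (
        r[a][b] * r[c][d] - r[a][c] * r[b][d],
        r[a][b] * r[c][d] - r[a][d] * r[b][c],
        r[a][c] * r[b][d] - r[a][d] * r[b][c],
    )


# Hypothetical loadings of four character measures on one common factor.
loadings = [0.6, 0.5, 0.55, 0.5]
n = len(loadings)

# Under a single-factor model each correlation is the product of two loadings.
r = [[1.0 if i == j else loadings[i] * loadings[j] for j in range(n)]
     for i in range(n)]

tetrads = tetrad_differences(r, 0, 1, 2, 3)

# Mean of the six intercorrelations among the four measures.
pairs = list(combinations(range(n), 2))
mean_r = sum(r[i][j] for i, j in pairs) / len(pairs)

print("tetrad differences:", tetrads)
print("mean correlation:", round(mean_r, 2))
```

In practice the observed tetrad differences are never exactly zero, which is why each is compared with its probable error before a single common factor is accepted.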

These analyses use the words will, effort, steadfastness, determina- 
tion, and persistence somewhat interchangeably, and leave one with 
a vague impression of the meaning of the mathematical factor (W).
Spearman (1938) also fails to be specific on this point. Apparently 
will, defined as effort, refers to a general discharge of energy in many 
activities including both cognitive and muscular processes. If the dis- 
charge is too feeble or too violent, it reduces g, the mental process 
which demands complex comparison. Will may also refer to persist- 
ence in the face of distractions implying a channelizing of energy to 
increase success in a particular situation which may or may not re- 
quire much of the g factor. As used by Webb and by Hartshorne and 
May, will is identified with kindness, trustworthiness, and cooperation,
which are social goals rather than effort or distribution of
energy. More careful definition and research are needed to clarify
this factor. It will probably be resolved into two or more subdivisions 
with further scrutiny. 

It is too early to write an adequate summary of primary traits.
The field is being actively investigated. The following general
conclusions, however, are in order:

1. The results of factorial analysis depend upon the variety of
items, the ambiguity of the items, and the variety of the persons who
respond to the items. When few persons (less than 800), and few
items (less than 10) are used, the investigation suffers from a paucity
of facts. The best results will doubtless come from factorial analyses
of unambiguous items, rather than from analyses of total scores from
several divisions of a test. Total scores invite ambiguity through the
arbitrary selection by the author of items which seem to indicate a
particular trait. When items which are really independent of one
another are added together, no quantitative analysis is feasible.

2. An inspection of the illustrations above shows rather startling
similarities between results of measures on different populations.
Spearman's g factor, and its variations, inertia (P) and oscillation
(O), are found in all other investigations where comparison and
energy factors are evaluated. Until the similarities have been pointed
out, these factors are usually given other names by other investigators.

ILLUS. 218. COMPARISON OF PERSONALITY INVENTORIES

Trait                       Guilford-Zimmerman                MMPI

A. Amount of dominance      G  General activity - slow,
                               [illegible]
                            A  Ascendance - submission
                            *M Masculinity - femininity       *Mf Masculine-feminine

B. Control of impulses
   Rigid or lax             R  Restraint - carefree
   Variable control         *E Emotional stability -          Ma Hypomania
                               fluctuation, guilt

C. Self-defense or offense
   Rationalize or somatic   O  Objective - oversensitive      Hy Hysteria
   symptoms                                                   Hs Hypochondriasis
                                                              Pt Psychasthenia
   Blame others             P  Personal relations -           Pa Paranoia
                               uncooperative
                            F  Friendliness - sadistic        Pd Psychopathic
                                                                 deviate
   Blame self               *E Emotional stability -          D  Depression
                               fluctuation, guilt
   Withdraw                 S  Sociability - shyness          Sc Schizophrenia
                            T  Thoughtfulness -
                               overactivity

* These traits are more general in nature than the rest.

A group of factors underlying particular modes and mental disorganization
is well described by Mosier’s work. (See Chapter XIV.)
One of his factors is quite like Spearman’s g, but the other seven are
patterns of ineffective modes of adjustments which seem to be dependent
upon social experiences, health, and physique rather than
upon intellectual activities. 

Content of Various Scales 

In order to indicate roughly the extent to which current methods
evaluate the same or similar types of behavior, Illus. 218 has been
prepared. This compares two batteries according to three broad
aspects of a person, amount of energy, control of impulses, and 
methods of self-defense. These three aspects are only partly inde- 
pendent of each other. Their subdivisions are not unique traits, 
but are rather broad groupings of modes of behavior. With few 
exceptions each of the inventories yields indications of each mode, 
but the methods of evaluation are sufficiently different to make a 
different contribution to the analysis. The authors do not feel that 
their evaluations are complete. 

The difficulty of observing or self-rating intangible traits is so great 
that much more care is needed to define modes of adjustment operationally,
and then to set up and validate measures of the more important
patterns. For instance, dominance is usually a function of
strength, energy, ideals, impulses, and experiences. Thus, a young
man may dominate a basketball court and be very retiring in an
economics class or at a social dance. His reasons for dominating in
basketball may be fine health and physique, or a desire to gain the
admiration of a particular person, or a compensation for failures elsewhere.
Also one must define the methods of dominating. Loud and
continual talking, issuing public writings, and getting elected to
offices are usually taken as evidence, but often the action of a group
is swayed by a few well-chosen words quietly spoken by one whose
judgment is respected. 


SUMMARY 

Although there has been much activity in the sampling of be- 
havior patterns by self-ratings, the careful worker must use these 
inventories with caution. They may have no value or do harm, if 
they substitute inaccurate analyses for a clear picture of the true
situation or if they give a person a false feeling of security or insecurity.
They may, however, indicate clearly and quickly areas for
further investigation and the type of remedial action that is needed.
More care is needed in defining the patterns to be measured and the
ways of measuring them. The description and measurement of personality
have been dealt with in considerable detail by Ellis (1946),
who reviewed 360 articles dealing with the validity of personality
questionnaires. He concluded that group-administered inventories
resulted in valid discrimination between adjusted and maladjusted
groups in only about one half of the reports. There was little indication
that such questionnaires can be used for individual diagnosis,
because one or more of the following criticisms are applicable:

1. The questions are often ambiguous, in that they may be interpreted
differently by different individuals. The manner of response,
for example, yes, no, or ? has a wide range of interpretations. Vocabulary
range is often too difficult. Moreover, the questions are often
so artificial as to have little to do with real actions. Forced-choice
items may not give reasonable alternatives.

2. The administration may influence the validity, namely, the
situation, the directions, and the personality of the examiner. For
instance, a test given two days before Christmas vacation had no
relation to similar measures given later.

3. The insight which a respondent has with regard to his own
intangible qualities is often not clear, and is often biased by his wishes
or protective reactions. Some persons nearly always have incentives
to overrate or underrate themselves.

4. The content of most questionnaires is so miscellaneous that
total scores have very different meanings for different persons. Many
items which are used to measure a unique trait have no correlation
with each other.

5. The cultural factors of one group may vary so much from those
of another that valid indicators in the first group will not hold for
the second.

On the positive side Ellis points out that personality inventories
usually have some useful potentialities, such as:

1. As research tools they allow a high degree of systematic analysis
and sampling, which if carried on will eventually yield positive results
and greater internal consistency.

2. The truthfulness with which an inventory is filled out may be
detected by special lie scores and other devices, and may be increased
by paired comparisons when the two alternatives are about equal in
social acceptance.

3. There is marked uniformity among both normal and abnormal
subjects in interpreting both questions and directions. This can be
improved by intensive study.

4. At present, unfavorable scores are nearly always indicative of
maladjustment, although favorable scores do not necessarily indicate
good adjustment.


STUDY GUIDE QUESTIONS 

1. Upon what theories were the items selected for the Cornell Index,
the MMPI, the Guilford-Martin Inventories, and the Bell Adjustment Inventory?

2. What effectiveness did the Cornell Index show when the cut-off score
was 10? 15? 20?

3. Why was the reliability of the Naval Personal Inventory so much
higher among psychiatric discharges than among normal recruits?

4. How did Hathaway and McKinley select items for their nine scales of
personality characteristics?

5. How did Hathaway and McKinley secure a validity score?

6. Compare the scoring categories of the MMPI, the Guilford-Martin
Inventories, the Adams Personal Audit, and Cattell's Personal Characteristics.

7. Summarize the strong and weak aspects of personality inventories.



CHAPTER XXIII 


RORSCHACH TECHNIQUES 




This chapter concerns techniques which present ink blots of various
colors, and request the subject to tell what they could represent.
The results are analyzed principally in terms of mental activities
which lead to perceptual integration and concept formation. The
patterns of these activities and the symbolism or content of the responses
are used to infer underlying personality structure and function.


INTRODUCTION 

In 1921 a report by Hermann Rorschach, a Swiss psychiatrist, was
published which described an elaborate technique for determining
modes of behavior from a person’s verbal responses to ten ink blots.
Five of the ink blots consist of various shades of gray, two of the
blots are gray with one shade of red, and three are entirely in color.
Various shades of red, yellow, green, orange, and blue are used. Today
there are several hundred technical reports on the use and interpretation
of these ink blots, and a research exchange for publications
concerning this technique has been established.

In 1937 Beck issued a book giving the first fairly comprehensive
set of directions and interpretations in English, and in 1944 and
1945 issued a 2-volume work giving basic considerations and a large
number of cases. Klopfer and Kelly (1942) issued a definitive text
on Rorschach techniques. Rappaport, Gill, and Schafer (1946) issued
an extensive review with representative case histories. Efficient
methods of recording responses have been developed so that the
exact portion of an ink blot which served as stimulus can be identified,
and the types of responses codified quickly. Frequency tables for
various age, sex, clinical and other groups have been published by
Beck (1944), Hertz (1946), Rappaport, Gill, and Merton (1946), and
Klopfer and Davidson (1946).

TEST ADMINISTRATION

The technical manuals of Klopfer and Kelly (1942) and Beck (1941) 
for individual test administration and scoring are the most widely 
used. The administration usually consists of three parts, although the 
third may be omitted if the first two have resulted in sufficient information.
In the first part, spontaneous reactions to all the cards are
secured if possible. In the second part, called the inquiry, nonleading
questions are asked to determine specifically which parts of the cards 
called out each reaction and some of the mental processes which re- 
sulted in the concept formation. In the second part additional 
spontaneous responses are also elicited. The third part is the testing- 
the-limits phase. Here the examiner exerts systematic pressure by ask- 
ing leading questions to ascertain the patterns of behavior not made 
clear in the first two parts and the degrees of rejection of common 
responses. The three parts will be briefly described. 

Spontaneous Reaction 

In this part of the test Klopfer and Beck usually prefer to seat 
the subject somewhat in front of the examiner so that both can see 
the same card and the subject does not face the examiner. Rorschach 
advised this position in order to keep the personality of the examiner 
in the background. Other examiners prefer to face the subject so 
as to observe his facial expressions and to give him more security if 
it is needed. Then, the subject is handed the first card and informally 
told something like this: “People see all sorts of things in these blot 
pictures, now tell me what you see. What might it be for you? What 
does it make you think of?” Hertz provides a subject with a trial
blot, so that he will have become oriented to the test and the examiner
before the actual test begins. The Trial Blot (Illus. 219) is
not one of Rorschach's, and is reproduced here simply to show typical 
aspects of outline, shading, and white spaces. 

Since the past experience of the subject may lead him to make a 
limited interpretation of the instructions, the examiner will correct 
and encourage him as needed. Thus some subjects who indulge in 
free association without much regard for the ink blot in hand are 
asked to limit their responses to what the card might be. Others who 
only describe the blots on the card are asked to tell what the card
makes them think of. Subjects are not instructed to make up a story,
and few stories appear.

ILLUS. 219. HERTZ TRIAL INK BLOT

(Designed by Marguerite Hertz and published with her permission.)

Many subjects ask questions such as, “Am I to look at the whole
card or to pick out parts of it, or am I to imagine things?” The examiner
makes a noncommittal answer such as: “It doesn't matter;
just tell me whatever the card brings to mind.” When the examiner
is asked whether the card may be turned, the answer is, “It’s okay
to turn the card.” Subjects are not instructed to turn the card.

When a subject has given two or more fairly complete associations 
with a card, he may be allowed to relinquish it since the material 
necessary for a significant score has probably been obtained. If he 
rejects the card without any responses, or with only a single fragmentary
response, as in some shock or depressed cases, Beck encourages
more responses by saying, “Take a little more time, most people see 
more than one thing.” Sometimes subjects will continue to make 
associations with a card almost indefinitely, so that the examiner 
must give him another card after a period which has been adequate 
for sampling the subject's responses. Beck suggests 10 minutes, but 
Hertz recommends 2 minutes for most subjects. 

Although no time limits are definitely set, the examiner is to keep 
a record of three periods for each card: first, the reaction time be- 
tween taking the card and giving the first content responses to it; 
second, the time between taking the card and finishing the last con- 
tent response; third, fairly long intervals between responses. The 
examiner should note the reasons for these intervals. By adding the
response times the total time is secured.

Since spontaneous reactions are believed to have a significance
somewhat different from that of nonspontaneous reactions, a nearly
verbatim record must be taken of what is said by both the subject and
the examiner, and any unusual behavior is recorded. Several standard
record cards are now available with detailed summaries or ratings of
responses (Illus. 220).
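
The three timing periods kept for each card, and the total time found by adding the response times, can be laid out as a simple record. The following Python sketch is only an illustration: the field names, the sample times, and the noted reason are assumptions and do not reproduce any published record card.

```python
from dataclasses import dataclass, field


@dataclass
class CardTiming:
    card: str
    reaction_seconds: float   # taking the card to the first content response
    response_seconds: float   # taking the card to the end of the last content response
    # Fairly long intervals between responses, each with the noted reason.
    long_intervals: list = field(default_factory=list)


def total_time(records):
    """Add the response times of all cards to secure the total time."""
    return sum(rec.response_seconds for rec in records)


# Hypothetical record for the first two cards.
records = [
    CardTiming("I", 8, 95, [(20, "studied the center detail")]),
    CardTiming("II", 15, 120),
]

total = total_time(records)
print("total time in seconds:", total)
```

A verbatim record of what is said would be kept alongside such timings; only the arithmetic of the time record is sketched here.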

The Inquiry 

The inquiry must determine the where and how of the responses.
A general question such as, “Where is the ______?” will
usually draw out an answer to show whether the subject used the
whole or virtually the whole card or selected only a part of it. Sometimes
the subject may be asked to outline with a wooden pointer the
figure he has seen on the card, or to draw it with a pencil on a special 
location chart. Each Rorschach card has now been charted and coded 
with numbered areas. For instance, Klopfer and Kelly (1942) show 
six large details and six small details commonly found on Card I. 

During this period any spontaneous additions to the original replies
must be reported carefully. A good deal of importance is also
attached to the omission of parts of cards. These omissions throw 
light on the ability of the subject to organize and on his rejection of 
certain details. 

In determining how the concept was formed, the examiner must, 
without the use of leading questions, get clear information with regard
to color, motion, form, shading, and other aspects. Usually the
question, “What is it in this card which makes you think of a
______?” is sufficient. In other instances the examiner must
ask additional nonsuggestive questions to get the subject to give a
complete picture.

Testing-the-Limits 

If the two previous parts of the test are well done, some examiners 
feel that testing-the-limits is unnecessary and that it may prevent
adequate retests later. However, experience accumulated in the last 
few years has given more importance to this phase, because it has 
often been found that subjects, particularly psychopathic subjects, 
will not give clear and complete accounts of concept formation with- 
out a systematic series of direct questions. The examiner must find 
out why a subject has failed to respond to obvious features, whether 
his failure was due to embarrassment or oversight, and specifically 
what prompted the subject’s responses when he made only fragmentary
answers. For instance, when the subject always uses parts of
a card and never the complete card, he may be asked to respond to
the whole card. Or, where no color has been mentioned he may be
specifically asked to respond to color. Where no human figures have
been seen in action, instructions may be given to try to find such
figures. Sample concepts may even be given to get the subject started
along fairly obvious lines.

Hutt and Shor (1946) use twenty common responses listed by Beck
(1944) to suggest those not already reported by the subject. Three
levels of suggestion are used. On the first level the examiner says,
“Some people see ______ in this card. Can you make it out?”
If the subject fails, then the examiner says, “Well, this is where they
see it,” indicating the area. If the subject still fails, the examiner says,
“You see, here are the ______,” and indicates some of the
details. Hutt and Shor also tell the subject, “Divide the cards into
those you like to look at, those you don't like, and those about which
you don't care one way or the other.” Reasons for these preferences
are sought.

Group Rorschach Techniques 

In order to reduce the time and labor of administering and scoring 
a test, several workers have developed procedures for giving modified
Rorschach tests to groups. Probably the best known is that of Harrower-Erickson
and Steiner (1945), who give three sets of instructions
(p. 33) as follows:

INSTRUCTIONS FOR SPONTANEOUS ANSWERS

The test which you are about to take is rather an interesting one and I
think you will enjoy it. All you have to do is to look at some slides which
will be projected on the screen and write down what you see. Now the
point about these slides is that they are nothing more or less than reproductions
of ink blots. Probably all of you at one time or another have
shaken your pen on a piece of paper, caused a blot of ink, and on folding
the paper produced a weird splotch which may or may not have resembled
something that you recognized. Now these slides are nothing more than
reproductions of inkblots formed in this way. Your task is simply to write
down what the splotches remind you of, resemble, or might be. You will
see each of these slides or blots 3 minutes, and you may write your answers
at your own time. Is that understood? It may help you later in the test if
you make a point of numbering your answers to each slide as you write
them down.


INSTRUCTIONS FOR THE LOCATING OF RESPONSES

Well, this is the first part of the experiment. Now we shall go on to the
second. I’m sure you will have seen a lot of amusing and different things in
the various ink blots, but one of the most important aspects of this test is
the fact that I must know as accurately as possible just what it is you have
seen and where it is you have seen it. In order that you can do this, you
will find on each page a little diagram representing the slide.

At this point the slide of Card I is thrown on the screen and the examiner
continues:

Now perhaps some of you saw on this particular slide a butterfly, and
then perhaps you also saw the legs of some person in the center here and
perhaps a boxing glove in this little protuberance here or a dog’s head
here on the side. [While speaking of these objects the examiner points to
the areas referred to which are encircled by a dark line on the slide.] Your
next task, therefore, is to number your own answers, if you forgot to do so
before, and then with your pencil to draw a line around the area where you
saw that particular object and attach to that area the number of the answer
you are describing. For example, let us suppose you have seen just those
four things which I mentioned. You would put a number 1 by “a butterfly,”
draw a line all the way around the miniature ink blot and put a number
1 beside this line. If “somebody's legs” was your second answer, you would
number that 2, draw a careful pencil line around the area on the diagram
and attach a number 2 to it. In other words you will do for all your own
answers what has been done for these hypothetical answers on the screen.

INSTRUCTIONS FOR OBTAINING ADDED INFORMATION

After the instructions concerning the recording of the location of
responses have been given, the slide of Card VIII may be thrown on
the screen and added information concerning the responses may be
asked for. Instructions at this point are:

Before you begin to mark off your answers, there is something else you
have to do for me. You have to help me reconstruct as accurately as possible
the kind of experiences you have been having or some of the characteristics
of the things you saw. You might, for instance, have seen two bears or two
animals here on the side. You might have seen two flags here in the center,
or you might have called these same parts two cushions. This part here
(pink and orange) might have reminded you of some kind of flower.

Some of you may have said, for example, that the bears looked as if they
were climbing up, but it is also very possible that you did not put in that
last bit of information. Now is your chance to do so if you want to. If you
want to explain to me that the animals you saw looked as if they were
stepping from one rock to another, you may add that information now.
But perhaps you did not see them as if they were stepping. That is
just as important. Perhaps they looked to you as if they were some kind of
animal on a heraldic design and you may have already said so. In that
case you will not need to give any more information.

Let us suppose that you not only saw cushions here but saw blue satin
cushions. In this case you would again amplify your answer because it is
important for me to know whether you got the impression of the satiny
or silky feel of the cushion, and whether you were impressed by its blueness.
Again this area may have reminded you of a flower because it was the
color of the sweet peas in your backyard. If it was the color that attracted
your attention and made you think of those sweet peas, then add this
information by writing in the word color.

After the instructions have been given and after any pertinent 
questions have been answered, the slides may be projected again in
the usual order. Each is shown approximately 2 minutes.

Self-recording techniques such as this have not as yet given as 
significant results as the clinical testing approach, but those who
have used them believe that they deserve further exploration. These
directions yielded booklets which were in some cases complete
protocols. However, if the results are to be complete and accurate
the subject must be able and willing to cooperate fully.

In another set of directions, the subject is asked to check multiple-choice
items from Harrower-Erickson and Steiner (1945, p. 255) as
shown in Illus. 221.

ILLUS. 221. MULTIPLE-CHOICE RORSCHACH ITEMS

You are going to see ten ink-blot pictures one after another.

Begin by taking a good look at Ink Blot I and see if it, or any part of it, reminds
you of anything or resembles something you have seen.

Then read through each of the three groups of answers for Ink Blot I (A, B, C).

Now underline the one answer in Group A, the one answer in Group B, and
the one answer in Group C, which you think is the best description of that
ink blot or any of its parts. You, therefore, underline three answers for Ink
Blot I.

When you have done this, if you wish, you may put a check beside any other
answer in any of the three groups which you also feel is a good description of
the ink blot or any of its parts.

Then do exactly the same thing for each of the other ink blots.


                              INK BLOT I

A                          B                          C

An army or navy emblem     A headless figure          A Halloween mask
Crumbling cliffs           Vertebra                   Storm clouds
A bat                      Tiny boxing gloves         A moth
Nothing at all             Spilt ink                  Two people
Two people                 Someone’s insides          A bell in the center
A pelvis                   Nothing at all             An X ray picture of the spine
An X ray picture           A butterfly flying         Animal heads on the sides
Pincers of a crab          Lava                       The stomach
A dirty mess               A coat of arms             Nothing at all
Part of my body            An X ray picture of the    Eyes glaring at me
                           chest

(By permission of Mary R. Harrower-Erickson and the Charles C Thomas Co.)


In the simplest scoring procedure the total number and per cent 
of poor first choices are found. In a comparison of 329 normal adults,
225 prisoners, 53 students referred to a psychiatrist, and 143 mental-hospital
patients, it was found that only 3 per cent of normal adults
had 40 per cent or more poor answers, while 15 per cent of prisoners, 
30 per cent of students referred to a psychiatrist, and about 65 per 
cent of mental-hospital patients had 40 per cent or more poor 
answers. 
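
The simplest scoring procedure amounts to a small piece of arithmetic, sketched below in Python. The sketch rests on assumptions: the sample record of thirty first choices (three for each of ten blots) is hypothetical, the function names are invented for illustration, and which answers count as poor depends on the scoring key, which is not reproduced here. The group percentages are those reported above.

```python
def percent_poor(first_choices_poor):
    """Per cent of first choices scored as poor (True marks a poor choice)."""
    return 100.0 * sum(first_choices_poor) / len(first_choices_poor)


def at_or_above_cutoff(first_choices_poor, cutoff=40.0):
    """True when the per cent of poor first choices reaches the cut-off."""
    return percent_poor(first_choices_poor) >= cutoff


# Hypothetical subject: 13 of 30 first choices scored as poor.
record = [True] * 13 + [False] * 17

# Per cent of each group with 40 per cent or more poor answers, as reported.
reported = {
    "normal adults": 3,
    "prisoners": 15,
    "students referred to a psychiatrist": 30,
    "mental-hospital patients": 65,
}

print(round(percent_poor(record), 1), at_or_above_cutoff(record))
for group, pct in reported.items():
    print(f"{group}: {pct} per cent at or above the cut-off")
```

The cut-off of 40 per cent is taken from the comparison above; any other cut-off can be passed to the second function to see how the classification of a record changes.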

In a more detailed scoring procedure the first and the second 
choices are both considered and the significance of each scoring 
category is weighed. The Multiple-Choice Rorschach is still being 
developed. The selection of items and their significance for various
behavior patterns will be studied a great deal in the next few years.
If the instructions are varied by asking the subject to list in order of
preference all the answers which he finds acceptable, then a profile
and detailed interpretation can be made, using the symbols for interpreting
individual Rorschach tests which are discussed below.

SCORING 

There are two procedures for scoring: first, an analysis of quality
or pattern of responses, and second, a standardized rating of quality
and amount of responses, using codes. Not all responses are scored.
Some are merely repetitions, with slight changes, of responses already
given. Others are extraneous material in the way of elaborations,
accessories or exclamations, which influence the scoring of other
responses. However, these accessories or afterthoughts are noted as
additional to the main concepts. The differences between main and
additional scores are believed to be of considerable significance in
many situations. The main scores may indicate ready-to-function
elements of the personality while the additional scores may indicate
potentialities. The determination of which responses are scorable
requires considerable experience.

About ninety Rorschach codes are defined and indicated by a
letter symbol followed sometimes by plus or minus. Letter
combinations are often used with the first letter indicating dominant
responses and the second subsidiary responses. The codes from Klopfer
and Kelly (Illus. 220) are fairly well established. Three general
categories are recognized in scoring, namely, location, determinants,
and content.

Location 

The letter symbols used in this category indicate the location of the
parts of the card that are reacted to or ignored. Location symbols
yield scores which indicate degree and order of organizing processes.
These processes are believed to be related to intellectual and
reasoning abilities, but are also molded by emotional adjustment
patterns.

Determinants 

The dominant percepts which guide the subject’s interpretation
are called determinants, which are thought to indicate strength of
impulses and degree of control.

Content 

Specific objects, when related to the subject, seem to provide
indications of the nature of a conflict and various mechanisms used
to resolve conflicts. The scoring symbols are numerous. In Klopfer’s
procedure the proportions of popular responses (P) are scored by
using frequency tables.

Indices 

A large number of ratios or indices are now used to indicate
balance and emphases (Illus. 220). They must be learned from the texts
on this subject, but three of them (approach, sequence, and
experience balance) are given as illustrations:

An approach index is concerned with location or organizing ability.
It shows to what extent an individual distributes his attention
between W, D, and Dd. Rorschach found averages of 7 W, 20 D, and
3 Dd among normal individuals who gave thirty responses. According
to Beck (1944) the expectancy based on normal adults has the
following percentages: W, 24 per cent; D, 66 per cent; and Dd, 10
per cent. Klopfer believed that an individual has not over- or
underemphasized any location if he shows from 20 to 30 per cent W,
from 45 to 55 per cent D, from 5 to 15 per cent d, and up to 10 per
cent Dd and S. Marked variations from this distribution, together with
other findings, are often found to have clinical significance.
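As a rough illustration of how an approach index might be tabulated, the sketch below converts location counts to percentages and compares them with the ranges attributed to Klopfer above. The function names, the dictionary layout, and the example record are assumptions made for illustration; treating Dd and S as one combined category follows the wording "up to 10 per cent Dd and S" and may not match any particular manual.

```python
# Ranges (per cent) attributed to Klopfer in the text; Dd and S are
# combined here, which is an assumption about how the limit applies.
KLOPFER_RANGES = {
    "W": (20.0, 30.0),
    "D": (45.0, 55.0),
    "d": (5.0, 15.0),
    "Dd+S": (0.0, 10.0),
}


def approach_percentages(counts):
    """Convert location counts (keys W, D, d, Dd, S) to percentages."""
    merged = {
        "W": counts.get("W", 0),
        "D": counts.get("D", 0),
        "d": counts.get("d", 0),
        "Dd+S": counts.get("Dd", 0) + counts.get("S", 0),
    }
    total = sum(merged.values())
    if total == 0:
        raise ValueError("record contains no scored locations")
    return {key: 100 * n / total for key, n in merged.items()}


def flag_emphasis(counts):
    """Label each location category as under, within, or over the range."""
    flags = {}
    for key, pct in approach_percentages(counts).items():
        low, high = KLOPFER_RANGES[key]
        if pct < low:
            flags[key] = "under"
        elif pct > high:
            flags[key] = "over"
        else:
            flags[key] = "within"
    return flags


# Hypothetical record of thirty responses.
print(flag_emphasis({"W": 8, "D": 15, "d": 3, "Dd": 2, "S": 2}))
```

Such flags are no more than a starting point, since the text stresses that deviations acquire clinical meaning only together with other findings.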

In addition to approach, an index of sequence is sometimes used.
This refers to the order in which the location categories are used on
each card. If a subject gives only one response to a card, there can
be no indication of sequence. However, if on the first card his
responses are W, D, d, Dd, S, and if he follows this same order on all
the other cards, it would be noted that he has a rigid sequence
pattern. Other sequence patterns are designated as orderly, loose, or
confused.

The experience balance index was described by Rorschach, who
followed Jung’s extroversion-introversion hypothesis but developed
somewhat more dynamic concepts. The introverted person is more
original and productive in mental life, more stabilized and organized
inwardly, more intensive in his relation to others, and more clumsy
or awkward. Such persons usually give few or no movement or color
responses, or their movement responses are considerably more
numerous than their color responses. The extroverted persons are more
stereotyped mentally, more in touch with reality, more excitable and
unstable emotionally, and more dextrous. Such persons usually give
considerably more color than movement responses.

A General Rorschach Score 

A number of persons have combined Rorschach signs into general
indices of integration of adjustment. One by Munroe (1945) is
discussed on page 687. Another is the systematic work of Drs.
Charlotte Buhler, Karl Buhler, and D. W. Lefever (1948). Using
individual examinations and the Klopfer-Kelly Scoring System, they
compared statistically each of ninety-nine diagnostic signs among five
groups: 30 well-adjusted normals, 70 psychoneurotics, 39 hysterics; 13
anxiety neurotics, 11 mixed psychoneurotics; 7 obsessive-compulsive
and 8 depressed cases; 50 psychopaths, 30 organic cases, and 27
schizophrenic cases. From these comparisons a single summarizing
score was developed (Illus. 222). Later the research included 518 cases
and 16 clinical groups.

The results for these groups were often different. Weights of from
-5 to +5 were assigned to each Rorschach sign in accordance with
the differences in the per cents of persons in two of the groups who
showed the sign. For instance, 93 per cent of normals and 46 per
cent of psychoneurotics gave sign 94: three or more M's and Sum C
equal to 3 or more. This sign was then given a weight of +2 in
accordance with a table of weights of degrees of significance. Since
each group was compared with the other four groups, ten comparisons
were made and ten sets of weights computed. The ten sets of
weights gave similar results, but the weights from the comparison
of normals versus schizophrenic patients produced the greatest ratio
for the variance between groups to the variance within groups. These
weights were, after some slight modifications to give more
significance to rare but important signs, called basic Rorschach
weights.
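The count of ten comparisons follows from pairing each of the five groups with every other group once. The short sketch below enumerates the pairs and computes the percentage difference for the sign 94 example. The group labels and function names are illustrative only, and the conversion from a percentage difference to a weight is not attempted because the table of weights is not reproduced in the text.

```python
from itertools import combinations

# The five groups named in the text (labels abbreviated for illustration).
GROUPS = [
    "normals",
    "psychoneurotics",
    "psychopaths",
    "organic cases",
    "schizophrenic cases",
]


def pairwise_comparisons(groups):
    """Every unordered pair of groups; five groups give ten pairs."""
    return list(combinations(groups, 2))


def percent_difference(pct_first, pct_second):
    """Difference in the per cent of each group showing a given sign."""
    return pct_first - pct_second


pairs = pairwise_comparisons(GROUPS)
print(len(pairs))                  # number of comparisons
print(percent_difference(93, 46))  # sign 94, normals vs. psychoneurotics
```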

To secure a score, an individual record is analyzed and recorded
on a score sheet which shows the positive or negative weight for
each item. The algebraic sum of the weights is the basic Rorschach
score. The means and standard deviations for clinical groups given
in Illus. 223 show a fairly constant trend on a scale from normal to
badly disintegrated. The ranges of scores are larger for minus than
for plus groups, but this may be due to sampling and to the fact that
the scale contains more negative than positive items. Even for these
small samples the differences between means are remarkably
significant for different clinical groups. Overlapping of groups is
shown in Illus. 224, where larger clinical categories are plotted
according to levels. Level I is called adequacy; level II, conflict;
level III, impairment; and level IV, reality loss. The dividing lines
between these levels are tentatively set at +15, 0, and -15 points on
the basic Rorschach score.

ILLUS. 222. MEAN PROFILE FOR VARIOUS GROUPS

(Buhler, Buhler, and Lefever, 1948, pp. 20, 21. By permission of the
authors and publishers.)

ILLUS. 223. NORMS FOR CLINICAL GROUPS ON BASIC RORSCHACH SCORES*

* (From Buhler, Buhler, and Lefever, 1948. By permission of the
authors and publishers.)
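The scoring rule and the tentative level boundaries can be sketched as follows, assuming the dividing lines are read as +15, 0, and -15. The example weights are invented, and the handling of scores that fall exactly on a dividing line is an assumption, since the text does not say which level they belong to.

```python
def basic_rorschach_score(weights):
    """Algebraic sum of the weights of the signs found in a record.

    weights: iterable of signed integers, one per sign present,
    each between -5 and +5.
    """
    weights = list(weights)
    if any(w < -5 or w > 5 for w in weights):
        raise ValueError("weights are expected to lie between -5 and +5")
    return sum(weights)


def integration_level(score):
    """Map a basic score to one of the four tentative levels.

    Scores exactly on a dividing line are assigned to the higher
    level here; that choice is an assumption, not part of the source.
    """
    if score >= 15:
        return "I (adequacy)"
    if score >= 0:
        return "II (conflict)"
    if score >= -15:
        return "III (impairment)"
    return "IV (reality loss)"


# Hypothetical record with four scored signs.
score = basic_rorschach_score([2, -3, 5, -1])
print(score, integration_level(score))
```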

The authors think of these scores as indicators of the degree of
personal integration, and point out that such indicators of ability
"to master conflict situations and meet reality" are useful in
addition to the clinical classifications. They believe that such
scores are aids to prognosis. They also emphasize that the
psychological structure of one level differs widely from that of
another, and that integration is not a one-dimensional function. Two
prominent dimensions or fields of conflict appear: (1) a conflict
between immediate and deferred goals, and (2) a conflict between
reality awareness and imagination. Thus, in level I tendencies to act
are determined by adequate aspiration and awareness of reality. In
level II there is a severe conflict because of unreasonable
aspiration, but no loss of reality awareness. In level III there is
little control of impulses and considerable loss of reality awareness,
and in level IV there may again be control of impulsive activity, but
contact with reality is almost entirely lost.

ILLUS. 224. DISTRIBUTIONS OF GENERAL RORSCHACH SCORES FOR
VARIOUS GROUPS

(Buhler, Buhler, and Lefever, 1948, p. 14. By permission of the
authors and publishers.)

Additional monographs are proposed which will define these and
other dimensions of personality, and explore the differences between
clinical groups more thoroughly.


Anxiety and Hostility Scores 

Elizur (1949) reported a method of analyzing and scoring Rorschach
protocols so as to yield two indices, one for anxiety, another
for hostility. He used only the content of responses and called the
results a Rorschach Content Test (RCT) score. Anxiety is shown
(Elizur, 1949, p. 259) by such responses as:

A. A frightening giant, a weeping child, a dangerous crevice, darkness
and gloom, a girl escaping, a rabbit running away, snakes, monsters,
witches, skeletons, blood, clouds, fire, smoke, twister.

a.¹ An unpleasant animal, an unbalanced figure, an impression of
coldness, spiders, mosquitoes, church priest.

Hostility is judged to be shown by such responses as:

H. A type of man I hate, ugly figure, a stupid animal, an angry face,
two animals fighting, a killed animal, arrow, gun.

h. Gossiping women, a primitive war mark, pliers, knife, and teeth.

¹ The capital letters indicate strong involvement and the lower-case
letters, less involvement.

The scores are sums of responses allowing 2 points for each
response showing clear or strong evidence, and one point for each
response showing a smaller degree of involvement. Scoring by eight
relatively untrained students took about 5 minutes for each
Rorschach record and yielded an average intercorrelation of .77 for
anxiety scores and .82 for hostility scores. Correlations between the
two scores ranged from .11 to .39 on small groups.
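The point system just described lends itself to a very small tally. In the sketch below each response is assumed to have already been judged by a scorer and labeled A, a, H, or h (or left unlabeled); the function name, the label list, and the rule of one label per response are illustrative assumptions, not Elizur's published procedure.

```python
# Points per label: capital letters (strong involvement) count 2,
# lower-case letters (less involvement) count 1.
POINTS = {"A": 2, "a": 1, "H": 2, "h": 1}


def rct_scores(labels):
    """Return (anxiety, hostility) totals for one record.

    labels: one entry per response, each "A", "a", "H", "h", or None
    when the response shows neither anxiety nor hostility.
    """
    anxiety = 0
    hostility = 0
    for label in labels:
        if label is None:
            continue
        if label not in POINTS:
            raise ValueError(f"unknown label: {label!r}")
        if label in ("A", "a"):
            anxiety += POINTS[label]
        else:
            hostility += POINTS[label]
    return anxiety, hostility


# Hypothetical record of six responses.
print(rct_scores(["A", "a", "H", None, "h", "A"]))
```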

In a small sample the men and women had nearly equal raw score
averages for anxiety (10.9 and 10.3), and the men averaged slightly
more than the women for hostility (8 and 6.7).

In order to show validity, the A and H scores were correlated with
results of questionnaires, self-ratings, and interviewer ratings on
seven characteristics: anxiety, dependency, hostility, submissiveness,
aloofness, ideas of reference, and depression. The correlations ranged
from approximately .50 to .70 among the various appraisals of
similar traits. There was a small positive correlation (about .20)
between a short verbal intelligence test score and the RCT anxiety and
hostility scores, and a small negative correlation (about -.28)
between age and the RCT scores. Elizur also reported qualitative
aspects of behavior, such as one's attitude toward his own anxiety,
the relative amounts of tensions as shown by the proportion of strong
and weak responses, the sequence, emergence, and recovery from
anxiety, and sex responses. Lastly he computed RCT scores for
published Rorschach records of neurotics and normals. The neurotics
averaged 16.5 (average of A and H) and the normals 8.8.

Elizur points out that the RCT scores are fairly reliable and
objective, do not require familiarity with usual Rorschach procedures,
and also yield quantitative and qualitative results. He also indicated
the need for much more research on larger samples.

INTERPRETATION 

Authorities agree that at least two years of careful study are needed
to develop adequate skill in interpreting Rorschach scores. A few
broad interpretations are given here to show the types of inferences
that are supported by many studies.

Location scores indicate the individual’s manner of approaching
life situations. Rorschach described persons who gave only complex
and ingenious W responses as superior in intelligence and drive.
Whole responses indicate capacity for abstraction if there is
originality and form accuracy. Many whole responses of a vague sort
point to superficiality and to drive without capacity. An absence of W
responses may mean lack of ability to generalize, but may also indicate
that the detail approach is more typical of the subject's mode of
meeting new situations. An excess of Dd, that is, tiny or unusual
details, is frequently related to anxiety, overcritical activities, and
feelings of inferiority. Unusual attention to edge details may be
related to need to avoid the inner aspects of self, and predominance of
details inside the blot may be related to preoccupation with one’s
inner life. Attention to white spaces often reveals opposition either
toward self or toward others. Fragmentary responses, such as where
one sees only part of a butterfly when most people see the whole, are
typical of the feeble-minded, the depressed, and those who have
severe difficulty with perceptual synthesis.

In interpreting determinants Rorschach and nearly all of the
subsequent writers in this field assume that persons have basic
impulses or drives from within which are controlled consciously or
unconsciously to varying degrees and by different mechanisms. Some
persons exercise rigid or stifling control of these impulses; other
persons let them explode. Normal persons show fairly strong impulses,
which are accepted as natural and directed into harmless or useful
activities. The Rorschach test attempts to indicate both the strength
of the inner impulses and the degree and manner of controlling them.
The normal use of controls is indicated by a good balance between
normal drives as shown by M, human movement, and adjustment to the
environment, as shown by F, form and content. One type of repressive
control is indicated by an unusual proportion of F responses. A
preoccupation with form is an indication of a rigid and unreasonable
or obsessional attempt to control impulses. Any percentages of F+
above 80 are usually found to indicate pathological conditions among
adults. Unusually good form responses to complicated patterns
indicate high intellectual efficiency. Poor form accuracy is typical of
organic pathology, feeble-mindedness and schizophrenia, and other
conditions where details are synthesized into concepts with great
difficulty.

Usual movement responses indicate a fairly normal attitude toward
instinctive drives, acceptance of them as something positive
and constructive rather than as uncontrollable forces which
constantly interfere with one's success. Many more than average
movement responses are associated with inventiveness or fantasies and
rare or no movement responses with depression and conflict or
feeble-mindedness, or with young children.

Color responses indicate emotional involvement or intensity of
emotional relationships to objects and people. Some neurotic
subjects respond to color either with exhilaration or with shock. The
normal response for many cards is FC, that is, form with a supporting
color determinant. CF, that is, color with a supporting form
determinant, indicates excitability, and sometimes suggestibility, or
infantile emotional control. An unusual number of color responses is
often associated with egocentricity and intense impulsiveness. A
dominant or excessive color reaction is usually indicative of marked
inability to control inner impulses, and a tendency to substitute
imaginary situations for real situations.

Responses to shading (K) imply indefiniteness or attention to small
differences in lighting, which is often associated with anxiety,
repression, and caution, sometimes vague and sometimes specific.
Texture and touch-feeling reactions are indicative of sensitivity both
in intellectual and in emotional fields. A normal number of texture
responses is essential for well-balanced behavior.

The interpretation of specific-association content must take into
account the background of the examinee. Thus, an unusually large
number of references to human anatomy might be normal for a
surgeon, but it might be indicative of a morbid tendency in a baker
or a stenographer. Frequently repeated responses may indicate
stereotypy, as in feeble-mindedness, or a special interest, or a
recurring fear. The number of uncommon responses is indicative of
either originality or of unusual usage of usual words. Breadth of
interests is roughly indicated by the number of different fields
represented in the reports.

A good many authors have now published profiles of Rorschach
scores of which those shown in Illus. 220 and 221 are typical. Both
of these use Klopfer's system of scoring. It appears that the normal
group averages more total responses than any of the others and
establishes a set of ratios which are important indicators of balanced
modes of behavior. Thus the average M, FM, and FC responses are all
about the same (nearly 8) while the F are almost twice as great (15).
Then there are important symbols of sensitivity and caution (K, Kc,
C', CF, and C) with between 3 and 1 responses each. For the hysterical
group the FM is about the same as for the normal; the M, FC, and
F are considerably reduced, while the CF and C are increased
significantly. For the obsessive-compulsive group the FM and F remain
about normal, but the M and FC are much reduced. For the
depressive-neurotic group the FM remains nearly normal, but all the
rest are only about one half normal expectancy. For the paranoid
psychopath marked reductions occur in about the same proportion in
all factors. For the alcoholic psychopath there is a relatively greater
reduction in the M and FC factors than in the rest. For two groups
with organic (brain) injuries, the profiles show great reduction of
all types of responses, but the greatest in M. The FC is a particularly
difficult response for the psychotic-organic group.

This rough inspection of profiles shows that a great deal may be
learned by close attention to small differences, once these differences
are determined to be significant. At present the individual variations
from such profiles are sometimes large and always important, so
that no individual diagnosis is justified on the basis of profiles
alone.

All Rorschach workers emphasize the great importance of
interpreting the responses of a person in terms of his whole
personality picture. No rigid interpretation of one determinant is
possible; each determinant must be accurately appraised and related to
all the rest. Complete records of persons who are fairly typical of
classes of behavior, such as healthy adults, the feeble-minded, the
depressed, hypomanics, schizophrenics, neurotics, adults with conduct
disorders, problem children, and mental hygiene cases are now
available from many sources.


RELIABILITY 

Reliability has been indicated by three main approaches. In one
the variations among examiners in procuring and scoring protocols
are noted. In another the results from repeated applications of a test
to the same individuals are shown. In still another the ability of
examiners to agree upon interpretations is indicated.

Cerf (1946) reported significant differences among nine examiners
in the number of responses obtained from Air Corps cadets. The
highest averaged 24.3 responses and the lowest 11.6, median 20.5.
These differences are far too large to be explained by chance and
probably indicate differences in the attitude, training or skill of the
examiners.

The scoring of a Rorschach protocol is still far from simple. It
depends partly on the examiner's own knowledge and bias, and
partly on the completeness of the protocol. Among experienced
examiners trained in the same method, the correspondence in scoring
usually is high and the differences are thought to be insignificant.
The use of scoring charts and more detailed definitions will
doubtless raise these reliabilities.

The test-retest reliability is fairly high for signs which have the
higher frequencies, but for rare signs the consistency between trials
is, of course, lower. The Basic Rorschach Score, which is made up of
about one hundred signs or indices, has a high split-half reliability
and will doubtless show satisfactory retest reliabilities, although none
have come to hand as yet. Ford (1946), using 123 young children,
found test-retest correlations of from .38 to .86 for various patterns.
Fosberg (1943) reported that attempts to fake results by test-wise
subjects were not successful in altering any fundamental patterns
and that attempts to misrepresent oneself were not successful among
naive subjects. There is considerable need for more research along
these lines, and also need to determine the effects of the subject’s
mood at the time he takes the test. Kimble (1945) found that
subjects tended to be more extroverted when examined in public.

The ability of examiners to agree upon interpretations was
investigated by Munroe (1945), who compared eleven examiners. They
used the "inspection technique" to classify independently eleven
women college students’ records. These records had been procured
by the group-administration method when the students were asked
to write their impressions in booklets. The examiners simply ranked
them in four groups in order of degree of disturbance. Under these
rather adverse conditions about 43 per cent of the judgments agreed
perfectly, 30 per cent differed by one rank, 12 per cent by two ranks,
and 15 per cent by more than two ranks. With this amount of
agreement from such data it can be seen that if the first four steps of
the Munroe scale (Illus. 225) were used with more adequate
instruction, the agreement between judges should regularly be more
than 90 per cent.
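One way such agreement figures might be tabulated is sketched below. It compares every pair of judges on every record and reports the per cent of judgments that differ by 0, 1, 2, or more ranks. Whether Munroe compared judges pairwise or against a criterion rating is not stated in the text, so the pairwise method, the function name, and the sample ratings are all assumptions.

```python
from collections import Counter
from itertools import combinations


def rank_difference_distribution(ratings):
    """Per cent of paired judgments at each absolute rank difference.

    ratings: dict mapping a judge's name to a list of ranks (1 to 4),
    one rank per record, in the same record order for every judge.
    """
    differences = Counter()
    for ranks_a, ranks_b in combinations(ratings.values(), 2):
        if len(ranks_a) != len(ranks_b):
            raise ValueError("every judge must rate the same records")
        for a, b in zip(ranks_a, ranks_b):
            differences[abs(a - b)] += 1
    total = sum(differences.values())
    if total == 0:
        raise ValueError("at least two judges and one record are needed")
    return {d: 100 * n / total for d, n in sorted(differences.items())}


# Hypothetical ratings of four records by three judges.
sample = {
    "judge_1": [1, 2, 3, 4],
    "judge_2": [1, 2, 4, 4],
    "judge_3": [2, 2, 3, 1],
}
print(rank_difference_distribution(sample))
```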

USEFULNESS 

The evidences of validity are to be found in comparisons of
Rorschach results with other criteria of adjustment, such as degree of
school and occupational success, delinquency, and psychoneurotic
manifestations. In all of these fields the evidence still leaves much
to be desired, but available reports are significant enough to make
Rorschach results of considerable use. A few reports will be discussed.

School Groups 

Davidson (1943) found that children, nine through twelve years of 
age, with high intelligence ratings showed a wide variety of person- 
ality patterns. They tended to be well balanced but more often in 
an introverted than in an extroverted way. There seemed to be no 
relationship between socio-economic status and Rorschach patterns, 
but degrees of good adjustment were reflected reliably by a group of 
signs. 

Hertzman and Margulies (1943) found significant differences
between junior high school boys and college men in line with usual
developmental changes toward more control of impulses.

ILLUS. 225. MUNROE'S RORSCHACH GROUPS

A. Unusually sound integration of the personality. Emotional problems either
very mild or very well handled:

Case 1. A girl with fair intelligence, conventional, practical, a little too
conscientious, with some restraint of imagination and spontaneity. Emotional
responsiveness is basically sound and of adequate range. (Rorschach items which
deviated somewhat from the normal range: low W, many good D, form accuracy
good; 3 M, 1 popular, 2 with constrained action; 5 common FM; 5 FC and 2 vague
CF; succession rather rigid.)

B. Emotional problems observable, too slight to affect behavior markedly or 
cause serious inner discomfort. 

Case 4. Immature, expansive girl. Unsystematic, rather careless, probably socially
oriented and not much interested in studies. Moderate intelligence. Warm, lively,
but rather superficial. (Rorschach items: 1 popular M, 3 lively FM, 3 FC and
14 CF; succession loose; Hd and Ad overemphasized, but a fair number of H and
A are given; 50 per cent W, popular or rather vague; form accuracy adequate
but inexact.)

C. Emotional difficulties rather marked, very likely to affect attitudes, interests
and performances, but not to an extreme degree.

Case 6. A girl whose adequate intelligence is poorly used because of great
timidity and self-doubt. Passive, submissive attitudes. Tries hard but is vulnerable
to criticism and easily discouraged. Strong hostility which is largely unconscious
and a source of anxiety, preventing free mobilization of energies, but not serious
enough to cause overt symptoms. (Rorschach items: predominantly flexor or
passive movement with occasional vigorous responses of disguised hostile content
followed by evasive F or K; mild color shock; CF, FC, usually vague or pretty but
"fire" in II; overemphasis on W, often insubstantial; several S; form accuracy not
bad but often vague or restricted; F per cent moderately high.)

D. Serious difficulty in meeting reality demands adequately, or marked inner 
distress or both. 

Case 7. Very intelligent, creative girl with serious emotional problems which
must markedly affect her work and relations with people. Extremely self-absorbed.
Thinking original, lively, resourceful, but too often inaccurate or at least highly
selective in choice of facts because of a tendency to project her own ideas or
emotions into the data. Erratic in work habits, resistant to authority, brusque or
aloof in social relations. (Rorschach items: almost every response contains some
sort of movement; many originals, some verging on the bizarre with clear symbolic
connotations; lack of Fc or FK; definite color shock; few color responses,
explosive CF.)

E. Severe psychopathology. Not discussed in Munroe’s report.

(By permission of Ruth Munroe and the editors of Applied Psychology Mono- 
graphs) 

Munroe (1945), working with college-freshman women, secured a
Rorschach Adjustment Rating by inspecting about twenty-five signs
indicating unbalance or mental difficulties and grouping the results
qualitatively into five ratings. Samples of the first four ratings are
shown in Illus. 225. She found that 74 per cent of the academic grade
averages at the end of the freshman year were well predicted by these
adjustment ratings, and that students who failed or had difficulty
because of emotional conflicts were detected almost 100 per cent. The
adjustment ratings did not predict very closely the academic
success of the highest third of the group, but these were well
predicted, as might be expected, by the ACE Psychological Examination.
It was shown that poorly adjusted girls who achieved better than
average grades all stood well above average in the ACE Psychological
Examination. With few exceptions the well-adjusted girls who did
poorly academically stood in the lowest test quarter of the ACE
Psychological Examination. Thus the adjustment ratings and the
test scores supplemented each other and together gave accurate
predictions of success.

Occupational Groups 

Piotrowski and others (1944) compared outstanding young male
mechanical workers with average workers and found significant
differences in several Rorschach signs. Kurtz (1948) reported a
carefully controlled application of the Rorschach test to
life-insurance sales managers. Two competent Rorschach experts used
the results of individual tests of forty-two satisfactory and
thirty-eight unsatisfactory sales managers to develop a scoring
system. About thirty-two signs or combinations of signs occurred more
frequently in one group than in the other. Seventy-nine of the eighty
managers were correctly classified by this system. Later, when the
same process was tried on another sample of managers, twenty good and
twenty-one poor, the Rorschach results gave only a chance indication
of success. At the same time a short Experience Record Form, which
has been frequently used in this field, correlated with success .48.

Kurtz also quotes from six other studies using group Rorschach 
results with adults in the military forces or in civilian occupations. 
All of these studies show that success in one or a number of occupa- 
tions was not significantly predicted by the test scores used. 

Cerf (1946) reported the application of both individual and group
Rorschach tests to candidates for pilot training. Predictions of
success or failure to complete the course were not usually significant
either from single signs or from various combinations based on
examiners’ judgments of what might be most significant. The highest
biserial corrected correlation was .26 among 281 pilots, 92 per cent
of whom graduated. For such a highly selected group this is fairly
significant.

In spite of these usually meager results, the value of the Rorschach
is being actively explored in industry, and it is highly probable that
many important uses will be found.

Clinical Groups

Most of the technical Rorschach reports deal with maladjusted 
persons. The important uses which have been demonstrated include: 

a. Diagnosis: 

To determine the degree of impairment of mental ability 
To indicate the depth of illness or psychotic malignancy 
To aid in differential diagnosis 
To detect organic cerebral damage
To show by retest a tendency toward improvement 

b. Treatment: 

To aid in determining the best course of treatment, and the 
probable success of treatment 

As a therapeutic agent, since the test may give the patient some 
emotional release 

c. Evaluation: 

To aid in evaluating therapeutic programs

SUMMARY

From the foregoing it seems clear that the Rorschach is not a
technique to yield a clear and adequate measure of specific
knowledge, skills, interests, or attitudes. It does not give a sharp
profile of types of impulses. However, it is probably the best single
test to show the ways one perceives and imagines in accordance with
his experiences and personality. It is a sensitive and complex
technique, so that unusual training is essential for its
administration and interpretation. It has already yielded valuable
practical assistance to clinical practice, but its full contribution
in analyzing personality is yet to be made. The close association of
many Rorschach workers with Kraepelin's clinical syndromes and with
Freud's theories has resulted in diagnostic uses of the Rorschach
principally along these lines. There is need for systematic analyses
of the results of Rorschach examinations to detect independent aspects
of behavior and to determine the usual relationships of functional
patterns. Such analyses may well result in more original contributions
concerning the nature of personality.

STUDY GUIDE QUESTIONS 

1. What basic assumptions are made in administering and interpreting
the Rorschach Test?

2. What types of information are usually secured in the three periods of
administration?

3. What aspects of behavior are related to location, determinants, and
content?

4. Why are indices or ratios often more significant than raw scores?

5. How was the basic Rorschach scoring system developed?

6. To what extent does the Basic Rorschach Score distinguish clinical
groups?

7. What interpretation of the Basic Rorschach Score do the authors

8. What limitations are given to Rorschach results by group
administration? By multiple-choice questions?

9. What results did Munroe find in predicting success in college from
Rorschach scores?

10. What evidences of reliability are there for various types of Rorschach
administration and interpretation?

11. How can the validity of Rorschach results be evaluated?



CHAPTER XXIV 


OBSERVATIONS OF 
BEHAVIOR 


The last two chapters have described ways of evaluating personality 
patterns by means of inventories and other test situations most of 
which appraise a restricted sample of behavior. In this chapter var- 
ious methods of rating or analyzing behavior from direct observation 
and the application of these methods to various situations are dis- 
cussed. 


METHODS OF OBSERVATION 

There appears to be only one basic method of observation, namely, 
to have one or more observers pay close attention to a situation and 
to record, as soon as possible, what they think happened. In a labora- 
tory this method is supplemented by objective records made by ap- 
paratus. There are many varieties of observation of which time 
sampling, record analysis, and sociometry are the most frequently 
used. 

Time Sampling 

Time-sampling methods have been found to be particularly well 
adapted to the study of modes of adjustment. A large number of 
studies in this field are reviewed periodically in the Review of Edu-
cational Research.

A good illustration is the work of Goodenough (1930), who re-
ported twenty-five 1-minute observations on thirty-three nursery
school children. The scores for each child were the sums of the
amounts of time spent in each activity. The odd-even reliability of
observed time in laughter, sociability, leadership, and physical
activity was approximately .80. Reliabilities of observed time in
compliance and talkativeness were approximately .50 and .60. The
correlations between sociability, leadership, and talkativeness were
generally above .60. Leadership also showed small positive
relationships with height, weight, chronological age, and mental age,
but zero relationships with sex, physiological tests, and size or
status of family.
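
As a rough illustration of how an odd-even reliability of this kind
may be computed, the following Python sketch sums each child's
odd-numbered and even-numbered observation periods separately and
correlates the two sets of sums. The function names and the sample
figures are hypothetical and are not taken from Goodenough's report.
The Spearman-Brown step is a common companion to split-half
correlations; whether it was applied in the original study is not
stated here.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(records):
    """records holds one list per child: the time (e.g. seconds) credited
    to a given activity in each successive observation period."""
    odd = [sum(r[0::2]) for r in records]   # 1st, 3rd, 5th ... periods
    even = [sum(r[1::2]) for r in records]  # 2nd, 4th, 6th ... periods
    return pearson_r(odd, even)


def spearman_brown(r_half):
    """Estimate full-length reliability from a half-length correlation."""
    return 2 * r_half / (1 + r_half)


# Hypothetical data: seconds of talking in six 1-minute periods, four children.
talking = [
    [10, 12, 9, 11, 10, 13],
    [30, 25, 28, 31, 27, 29],
    [5, 0, 8, 4, 6, 2],
    [20, 22, 18, 19, 21, 24],
]
r_half = odd_even_reliability(talking)
r_full = spearman_brown(r_half)
```

With only a few children the coefficient is very unstable; the sketch
is meant to show the arithmetic, not a realistic sample size.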

Another good example of the time-sampling method of observation
is reported by Anderson, Brewer, and Reed (1946), who studied the
behavior of children and teachers in the first, second, and third
grades. They devised a check sheet to be used in 5-minute observa-
tions of a single child. The sheet contained fifty-one code numbers,
each of which referred to a carefully defined situation. For instance
(p. 23):

DC    domination by the teacher with evidence of conflict. The child
      has given some expression of his goals or desires and the teacher
      behaves in a manner to stop that behavior ... between them ...
      eight types are differentiated.

DC 1  determines a detail of activity in conflict.

DC 3  relocates child because of some disturbance.

DC 4  direct refusal or contradiction, evasion of child's protest or
      complaint, postponement without expressed reason or consideration.

DC 5  disapproval, blame, or shame directed toward child as a person.
      Includes rejective behavior.

DC 6  warning, threats, reminders, conditional promises, obstruction,
      or interruption.

DC 7  calls to attention, because of slow, indifferent, or negativistic
      behavior.

DC 11 punishment, sending out of room, attack, or depriving child of
      material or activity.

In similar fashion, DN, domination with no evidence of conflict; DT,
domination in working together; IN, integration with no evidence
of working together; and IT, integration with evidence of working
together are defined, along with a large number of other situations
such as the child's activity in problem solving, in social contacts, and
in conforming to the teacher's demands.

By taking twenty-four observational records of each child for
5-minute periods over various parts of several days, it was possible to
record a large number of episodes and to establish high reliabilities
for teacher behavior, group behavior, and individual pupil behavior.
It was found that the teacher who used more integrative direction
had children with much more spontaneous initiative and social con-
tributions to others, and with much less whispering, playing, and
nonconforming than the teacher who used more dominating direction.

3 Applied Psychology Monographs, No. 11, 1946, p. 23.

Record Analysis

The facts recorded in a carefully prepared biography are seldom as
accurate as the direct observations which are made in time sampling,
but the analysis used in biographical studies and case studies may
include a systematic personality study. A good illustration is the
report of Friend and Haggard (1948), who carefully analyzed extensive
counseling and follow-up records of eighty men who sought voca-
tional counseling from the Family Society of Greater Boston. Each
worker's record was studied intensively and rated on the basis of the
judgments of at least two carefully chosen and well-trained raters.
The difficulties of making unambiguous ratings were studied, and
numerous revisions of items were made which increased the proba-
bility of securing only one meaning for each item. Each worker was
rated on a schedule of 173 items distributed among seven general
sections:


                                                          Number of
                                                            Items
   I. Early life                                              29
  II. Nature of current family life                           16
 III. Early or beginning jobs                                  7
  IV. Response to counseling                                  20
   V. Personality patterns and general work reactions         40
  VI. Reactions to specific work conditions                   34
 VII. General work capacities, adjustment, and improvement    27
      Total                                                  173


Sections V and VI, items which are related to work interest, are re-
produced in Illus. 226.

Correlational and other analyses led to important conclusions
such as:

a. People who deviate in mental health tended to recreate their
early family patterns in their own current families, and to have
ambivalent or fluctuating feelings in many fields.

b. Moderate rivalry with brothers or sisters paralleled a favorable
attitude toward usual job competition, whereas intense rivalry was
linked with a shunning of competition. The boss was often identified
with the qualities of a parent.

ILLUS. 226. RATING SCHEDULE FOR REACTIONS TO WORK

V. PERSONALITY PATTERNS AND GENERAL WORK REACTIONS

73. Strength of goal orientation and ability to make realistic, long-range
    plans:* (1) practically none, (2) some, (3) considerable
74. Amount of interest in learning: (1) practically none, (2) some,
    (3) considerable
75. Amount of interest or energy devoted to leisure-time activities:
    (1) practically none, (2) some, (3) considerable
76. Amount of thought about jobs: (1) practically none, (2) some,
    (3) considerable
77. Rigidity, or unwillingness to change set ideas:** (1) strong, (2) some,
    (3) practically none
78. Balance between give and take: (1) poor, (2) some, (3) good
79. Amount of self-disparagement directly expressed: (1) considerable,
    (2) some, (3) practically none
80. Amount of self-disparagement more unconsciously or indirectly
    expressed: (1) strong, (2) some, (3) practically none
81. Fear of failure directly expressed: (1) strong, (2) some,
    (3) practically none
82. Fear of failure indirectly or unconsciously displayed:
    (1) considerable, (2) some, (3) practically none
83. Willingness to risk disappointment or turndown: (1) practically none,
    (2) some, (3) considerable
84. Amount of tolerance for frustration: (1) practically none, (2) some,
    (3) considerable
85. Types of reaction to difficulty: (1) blames self, (2) frenzied or
    disorganized activity, (3) blames particular persons or groups,
    (4) devil-may-care attitude
86. Further type of reaction to difficulty: (1) blames general conditions
    or bad luck, (2) gets sick, (3) wants to get even, as by "chiseling,"
    (4) runs away or drinks
87. Further type of reaction to difficulty: (1) hopeless, "limp" attitude,
    (2) feels "kicked around," (3) continuing on in same direction,
    (4) renewed striving in new direction
88. Counteraction, or the ability to persistently fight or struggle (an
    additional reaction to difficulty): (1) practically none, (2) some,
    (3) considerable
89. Insistence on getting along solely by his own efforts and making his
    own decisions: (1) considerable, (2) some, (3) practically none
90. Reliance on "pull" or favoritism in job-getting: (1) considerable,
    (2) some, (3) practically none
91. Reliance on counseling, courses, etc., as magical means for
    job-getting or getting ahead: (1) considerable, (2) some,
    (3) practically none
92. Reaction to work where "fitting" with the boss is chief means of
    getting ahead: (1) very unfavorable, (2) unfavorable, (3) indifferent,
    (3a)† ambivalent, (4) favorable, (5) very favorable
93. Reaction to work where hard work or industry are the chief means of
    getting ahead (for scoring, see item 92)
94. Reaction to work where brains, skill, or long-time training are the
    chief means of getting ahead (for scoring, see item 92)
95. Reaction to work where aggressiveness or initiative are the chief
    means of getting ahead (for scoring, see item 92)
96. Reaction to work where loyalty to the company is the chief means of
    getting ahead (for scoring, see item 92)
97. Reversal or tendency to follow patterns in work opposite to early
    family situation: (1) considerable, (2) some, (3) practically none
98. Repetition or compulsive tendency to re-enact in work earlier family
    situations: (1) considerable, (2) some, (3) practically none
99. Job-change rating: changed to job requiring less skill;
    (1) approximately same amount of skill, (2) somewhat more skill,
    (3) a great deal more skill
100. Changed categories of work: (1) considerable, (2) somewhat,
    (3) practically none
101. Degrees of skill involved in chief occupation: (1) practically none,
    (2) some, (3) considerable
102. Tendency of client to spoil own job chances: (1) strong or repeated,
    (2) some, (3) practically none
103. Ambivalence about jobs or vocations or earning a living:
    (1) considerable, (2) some, (3) practically none
104. Mental health or emotional stability: (1) psychotic, (2) neurotic,
    (3) some neurotic tendencies or symptoms, (4) normal‡
105. Physical-mental (emotional) rating of change: (no score indicates
    worse); (1) the same, (2) somewhat better, (3) much better
106. Generalized or free-floating fear: (1) considerable, (2) some,
    (3) practically none
107. Physical health: (1) poor or handicapped, (2) fair, (3) good
108. Accident-proneness: (1) considerable, (2) some, (3) practically none
109. Amount of delinquency: (1) considerable, (2) some,
    (3) practically none
110. Over-all relation of job difficulties to personal difficulties:
    (1) very close relation, (2) some, (3) practically none
111. Impact of depression: too young; (1) continued unemployment,
    (2) employed off and on, (3) poorly paid work, (4) impact not severe
112. Relief history: (1) considerable, (2) some, (3) practically none

VI. REACTIONS TO SPECIFIC WORK CONDITIONS

Ratings for this section: ( ) no information, (1) very unfavorable,
(2) unfavorable, (3) indifferent, (3a) ambivalent, (4) favorable,
(5) very favorable

113. Reaction to work involving sharp competition
114. Reaction to work involving little or no competition
115. Reaction to usual competition
116. Reaction to work where there are good possibilities of advancement
117. Reaction to work where client can receive a good deal of special
    recognition on the job, when his work objectively merits some
118. Reaction to familiar work and/or familiar surroundings
119. Reaction to opportunities to gain new experience
120. Reaction to work where he is "left alone," not closely supervised
121. Reaction to work where good pay is the major reward
122. Reaction to work where he is sure of future security
123. Reaction to work where he can be his own boss
124. Reaction to work where the boss takes a special friendly interest
125. Reaction to work where the boss is domineering
126. Reaction to work where there is an atmosphere of favoritism
127. Reaction to work where there is an objective atmosphere
128. Reaction to work where the company treats him as an individual, not
    a number
129. Reaction to presence of congenial fellow-workers
130. Reaction to work which does not involve contact with others
131. Reaction to work which involves contact with strangers
132. Reaction to responsibility for the performance of his work
133. Reaction to responsibility for supervising or leading others
134. Reaction to work which involves the acquisition and use of
    considerable skill
135. Reaction to work of a virile sort
136. Reaction to work commanding relatively high social prestige
137. Reaction to presence of good physical working conditions
138. Reaction to work geared to his abilities
139. Reaction to work which appears to him as constructive or useful
140. Reaction to unions
141. Reaction to civil service jobs
142. Reaction to having a sense of group "belongingness" on the job
143. Reaction to feeling that perhaps through a union he has a certain
    grasp (at least understanding, if not control) over the forces which
    affect him in the job situation
144. Reaction to the necessity for a strenuous or exciting exertion
145. Reaction to possibility of accidents
146. Reaction to work where he can move around and be outside

* A realistically oriented positive stick-to-itiveness in the sense that
the goal was within the occupational level on which the client could
operate.
** A compulsive quality is suggested in the use of the word rigidity.
† On the IBM cards, the scale 1, 2, 3, 3a, 4, 5 was represented,
respectively, by punches of 1 and 4; 1; 2; 2 and 4; 3; 3 and 4, because
we had only four numbers to punch and needed six categories.
‡ For purposes of correlation, this item was arranged in a continuum as
follows: deviation in mental health: (1) considerable, (2) some,
(3) practically none.

(Adapted from Friend and Haggard, 1948, p. 21. By permission of the
authors and the editor of Applied Psychology Monographs.)

c. Strong interest in learning went with actually taking courses of
specialized training.

d. Family disruption, through death, separation, or placement in
foster homes, was linked with sensitivity to turndowns or to rejection,
but not to good work adjustment.

e. Tendency to sabotage himself by spoiling his job chances was
a device by which an individual avoided possible job failure, or
settled a grudge against a parent, or showed self-hatred. It was closely
related to several items, such as rejection by and antagonism toward
father or family, rigidity and unrealistic thinking about jobs, and
reliance on pull.

About twenty other findings were reported, which were summarized
under two general topics:

1. A feeling of good interrelatedness with others and a history of
strongly knit family life were found among those who make good
work adjustments, and vice versa.

2. Strong job satisfactions were often those which compensated for
early deprivations. Among well-adjusted workers these drives led to
accomplishment, but at the other extreme early deprivations were
used as excuses for failure to make reasonable efforts.

The significance of such findings needs a good deal of study. The
findings also point to the need for more systematic evaluations, by
interview or inventory, of the satisfactions or deprivations of early
and later home and school life.

Sociometry 

Sociometry is a name given to a procedure for sampling the atti-
tudes or behavior of members of a group toward each other. This
procedure usually consists of short questionnaires or interviews. For
small children the questions include such as these: "Who would you
like to play with? Who do you like to work with? Who do you like
to have sit near you?" Among adults the questions might deal with
activities in business, recreation, civic enterprises, or the home. The
results are then put in graph form to show the relative acceptance
of members of the group, or degree of activity in various social or
economic contacts for various members of the group. A good socio-
metric chart can be the basis of careful planning for a balance of
activities within a group.
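
To make the tabulation behind such a chart concrete, the sketch below
counts, for a small hypothetical group, how often each member is
chosen by the others and which pairs choose each other. The names,
the data layout, and the two functions are illustrative assumptions
only; they do not represent a standardized sociometric procedure.

```python
from collections import Counter


def acceptance_counts(choices):
    """choices maps each member to the list of members he or she named.
    Returns the number of different members who named each person."""
    counts = Counter({member: 0 for member in choices})
    for chooser, chosen in choices.items():
        for other in set(chosen):      # ignore repeated mentions
            if other != chooser:       # ignore self-choices
                counts[other] += 1
    return dict(counts)


def mutual_pairs(choices):
    """Pairs of members who named each other, in alphabetical order."""
    pairs = set()
    for a, chosen in choices.items():
        for b in chosen:
            if a != b and a in choices.get(b, []):
                pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)


# Hypothetical answers to "Who would you like to play with?"
play = {
    "Ann": ["Bob", "Cal"],
    "Bob": ["Ann"],
    "Cal": ["Ann"],
    "Dee": ["Ann", "Bob"],
}
chosen_by = acceptance_counts(play)
friends = mutual_pairs(play)
```

In this made-up group Ann is named by three others and Dee by none,
which is the kind of contrast a sociometric chart displays graphically.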

Since no standardized procedures or accepted norms have yet come
to hand, no attempt is made here to give specific illustrations or to
review the voluminous literature on this topic. But the sociometric
approach is an important one, because it deals with interpersonal
relations in a more direct and comprehensive manner than most of
the other methods. Sociometry, a monthly magazine, and a series of
monographs are now published under the editorship of Jacob L.
Moreno, who has long been the advocate of this technique in this
country.


USUAL SITUATIONS 

Usual situations are those in which there is little attempt to con-
trol the environment for the purposes of making an observation.
Such situations are thought to be the most valid indicators of a
person's real adjustment because he is presumably not trying to
influence an observer, but is acting naturally. Situations where con-
duct is directly measured will be discussed first, then observations of
conduct, and lastly interview situations.

Conduct Measures

Among the most elaborate quantitative reports of adjustment be-
havior from the point of view of ethics are those of Hartshorne and
May (1928-30), which describe a 6-year project made for the Institute
of Social and Religious Research. These workers and their assistants
summarized existing methods of appraising deceit, self-control, sug-
gestibility, moral knowledge, reputation, and integration. With great
ingenuity they improved old tests, devised new ones, and applied
them to many different groups of school children to discover typical
performances in various environments. The work of these men has
inspired many others to continued research and test refinement. Al-
though their schedules included many types of appraisal, conduct
measures were prominent. Outlines of their work are given below to
show the thoroughness of their investigations and to allow an evalua-
tion of the various methods of procedure.

Conduct measures were secured by placing persons in natural situa-
tions, and then recording the number of times a pupil used unethical
conduct to attain a goal. The test situations discussed immediately
below were used to appraise deception.

Measures of Deception, Hartshorne and May (1928). 1) Copying
technique. This appraisal was conducted by having the pupils sit
together in pairs. Each pair was then given two forms of multiple-
choice tests that looked alike. Both forms contained the same words,
but the choices were arranged in different orders, so that if one pupil
copied from the other this might be detected by comparing their
answers. This method was found ineffective because of the difficulty
of distinguishing chance from copied errors, and also because it did
not furnish equal opportunities to all who might have a desire to
copy.

2) Self-scoring technique. In order to make an appraisal of de-
ception in self-scoring, ordinary mental or achievement tests were
given, collected, and scored without marking the test sheets. They
were then returned to the pupils, who were asked to score their own
tests from keys which were provided. Deception can be measured by
the amount of difference between scores secured in these two ways.
Tests of this kind, unless they have a wide range of difficulty, tend to
present different motives for cheating among those with different
abilities. This method proved to be fairly effective, but rather ex-
pensive.
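
The arithmetic of the self-scoring technique can be sketched in a few
lines of Python. The function name and the three pupils' scores are
hypothetical; the sketch simply subtracts the examiner's unmarked
scoring from the pupil's own scoring of the same paper. A small
positive difference may reflect an honest scoring error rather than
deception, so no cutoff is suggested here.

```python
def self_scoring_gains(examiner_scores, self_scores):
    """Return each pupil's self-scored total minus the examiner's total.

    Both arguments map a pupil's name to a score on the same test paper.
    """
    return {
        pupil: self_scores[pupil] - examiner_scores[pupil]
        for pupil in examiner_scores
    }


# Hypothetical class of three pupils.
examiner = {"A": 31, "B": 24, "C": 40}
reported = {"A": 31, "B": 33, "C": 41}
gains = self_scoring_gains(examiner, reported)
```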

3) Improbable achievement. In this situation tests were used
twice: once under supervision, in order that the pupil's actual
achievement could be determined, and once without observation,
in order that he could raise his score by cheating. A wide variety of
these tests were used, such as:

1. Puzzle solution in which the rules of the game may be broken

2. Weight discrimination in which seven pill boxes have the
weights printed on the bottom

3. Peeking tests in which mazes or other figures are to be traced
with the eyes closed, and in parlor games, such as "pinning the
tail on the donkey"

4. Two equivalent forms of mental tests in which the answer sheet
is furnished with only one form

5. Two vocabulary tests, one taken home

6. Athletic tests (Self-recorded measures of strength of grip, lung
capacity, chinning, and broad jump are used, one test in public,
one test in private.)

7. Potato race (Cheating is measured by the number of times a
child broke the rules by picking up two potatoes.)

8. Three trial-of-speed tests, such as cancellation of A's, digit sym-
bol substitution, and making dots in small squares.

4) Stealing. This group of tests included two types of appraisal:
(1) tests in which coins are used in puzzles or problems (the score
indicates the number of coins not returned to the experimenter) and
(2) situations in which a storekeeper returns too much change, or in
which a purse with money is found.

5) Lying. In order to measure lying to escape disapproval, the
pupils were asked to answer questions concerning their cheating ac-
tivities on the tests just described. Lying to win approval was evalu-
ated by a questionnaire on socially approved activities (Illus. 227).

Deception was inferred by a comparison of the actual (or prob- 
able) and the self-reported achievements. If the self-report was con- 
siderably beyond the limits of probable achievement, the pupil was 
discredited. 

ILLUS. 227. LYING TO WIN APPROVAL

This method consists of a series of rather personal questions. There
are many specific acts of conduct which on the whole have rather
widespread social approval, but which at the same time are rarely done.
The questions revolve around situations of this sort.

The test is in two forms. Each form contains 36 questions.

CEI Attitudes SA
Form One

Name ..........          Date ..........
School ..........        Grade ..........

Answer the following questions by underlining YES or NO. If your
answer is YES, draw a line under YES. If your answer is NO, draw a line
under NO. Please answer every question.

1. Did you ever accept the credit or honor for anything when
   you knew the credit or honor belonged to someone else?    YES  NO
2. Did you ever act greedily by taking more than your share
   of anything?                                              YES  NO
3. Did you ever blame another for something you had done
   when you knew all the time it was your fault?             YES  NO
4. Do you usually report the number of a car you see
   speeding?                                                 YES  NO
5. Do you always preserve order when the teacher is out of
   the room?                                                 YES  NO
6. Do you report other pupils when you see cheating?         YES  NO
7. Did you ever pretend to understand a thing when you
   really did not understand it?                             YES  NO
8. Have you ever disobeyed any law of your country or rule
   of your school?                                           YES  NO
9. Do you speak to all the people you are acquainted with,
   even the ones you do not like?                            YES  NO
10. Do you usually call the attention of people to the fact
   that you have on new shoes or a new suit or dress?        YES  NO

(And 26 more questions.)

(Hartshorne and May, 1928, p. 98. By permission of The Macmillan Co.)

Measures of Cooperation, Hartshorne and May (1929, Chap. III).
1) Self-or-class test. In this situation pupils were required to choose
between working for the class or for themselves. Cash prizes in spell-
ing were announced for both class and individual honors, but a child
was not allowed to participate in both contests.

2) Allotment of prize money. Pupils were asked to vote whether
the money should be distributed to members of the class, or given to
the school or some hospital child.

3) Sharing of equipment. Each pupil was given a box containing
ten articles: a drinking cup, pencil sharpener, ruler, eraser, pen, pen-
holder, double pencil, and three other pencils. He was then allowed
to give whatever he wished to make up boxes for children who had
no useful or pretty things.

4) Providing material for hospital children. Each child was given
a set of four envelopes and asked either to make or to find jokes, puz-
zles, pictures, and stories for sick children. He was also asked whether
he would do this later, or whether he would like to help, but would
not be able to do so.

5) Records of service. Teachers made records from December to
June on the actual activities of each pupil in the cooperation and
service projects just described.

6) Verbal description. This test was suggested by the ancient but
vivid word sketches of the Greek writer Theophrastus. In order to
get up-to-date materials, items representing cooperative or helpful
acts were listed and arranged in rank order. The situations were then
cast into short sketches of boy and girl behavior, ranked from most
to least helpful, and given values from 9 to 0. Illustration 228 shows
some of these sketches. Teachers were asked to match each pupil with
the approximate portrait. A variety of this test is the Guess Who
Test of Maller (1932), which includes a wider variety of character-
istics.

7) Check list of traits. Two forms of eighty words, each describ-
ing cooperative acts, were applied to each child by at least two teach-
ers. They also used 5-point descriptive rating scales of cooperation
and selfishness.

Meosincs of Pei^nirnfc and Inhibilion 1) Story Completion. In 
this test situation an exciting story was read to a class as lai as the 
climax. The ending ol the stor\ was supplied in foiins dilliciilt to 
lead, such as (1) with capital Ictteis run together, (2) with small let- 
ters and capitals iiin together, and (3) w'lth spaces between capital 
letters in the wiong j^laces As the child was instiucted to draw' ver- 
tical lines between words in order to facilitate reading, a scene was 
readily detci mined by the numl^er of w'ords cori'ettly marked off. 

2) Puzzles Both mcch.'inical and verbal pu/zles wcic used The 
scores were the time spent working 

3) Letter counting. Pupils were asked to count the letteis in a 
page of picd type, and the time worked was recorded. 

4) Dish action tests. In tlicse tests, lines of digits w'cr e to be added. 
The digits w'cre ptinred among cuiioiis sets ol pictures and lines 
The results weie compared with noiiual addition score:^ 

5) Inhibition tests These included situations where each child 
was given a small box of candy and asked not to touch it until alter 
a senes ol tests On anothci occasion a small combination safe was 
placed on each desk with instructions not to w'ork at it until after a 
senes of six tests Scores on these tests w’ere the number of times a 
child failed to comply by eating randy or manuDulating the intri- 
guing little safe before the set time. 

6) Ratings by Teacheis Check lists of words describing self- 
control w'ere used together wuth 5-point descriptive scales of eight 
items control of temper, contiol of attention, control of laughter, 
talking too much, telling secrets, impatience, ovenndulgence in 
candy, and control of body movements. 
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ILLl S. *228 VI RB \L CHAR VC I LR SKJ- FCHI^S 


The final set, in order of helpfulness and with the scoie \alues based on the dis- 

tnbution of the pictures by the judges, is presented here 

Q — T T IS sincerely interested m promot jng the happiness and welfare of every- 
one. Ills warm spirit of fneiidhness extends to all, no matter where 
thc> are or what their race, class, or creed may be Even the shghtcst 
need stirs him to some friendly act So stiong is this puqjo-^e that he 
would endure the scorn of his fellows in ordei to help a stiupghiig cause 
or 10 help poisons iii trouble, even though the> might not seem to 
dcsen c belji of any kind. 

7 — F F IS quite sincere m wanting to be of help To him it is not so much a 
m«itter of high purpose as of plain decency When the cause seems 
to be a worthy one, he is very generous and would give a cherished 
pos^icssion or deny himself long-anticipated pleasures even for such 
seemingly remote interests as scholarship funds for older students m 
oth( r lands or for hospitals for adults lie w ould persist in his helpful 
acts even though people thought he w.is veiy foolish to do so. He is 
prompt to oiler his services to anyone obviously in trouble 

5 — N "N does not think much about being helpful as a general thing, but he has 

a kindly nature and liis sympathies arc easily aroused When his 
emotions arc appealed to, he is ready to help almost an>one or any 
cause but would piobably draw the line at helping people he dislikcC 
or who seemed foreign or undeserving He always lends a hand w hen 
the occasion is obvious and, if the appeal is strong enough, would give 
aw a}' Ins owm things or money he hisd saved for himseir m oider to help 
even such objects as a college for an iindeiprmleged section of the 
country oi a hospital for poor people The disapproval of his parents 
or Lcaclicis would not dissuade him 

1 — L L is not hkcly to help anyone at all unless he finds that others are doing 
so especially his best friends He d chip m a little for flowers for a 
classmate or help his elder brother lake care of his small sister if U did 
not taki' too much time from his play or if he didn t have to do too 
much vvork or spend more than a few cents But for the most part he 
doesn't want to help, nor docs he care whethei he is rcganled hy others 
as helpful 

0 — P Here is your thoroughly haid-boilcd youngster The person or cause 
needing help makes no difreiencc at all to P; but if someone would 
call for it, he would lei him take away some object he w'anted to get 
nd of provided he would get some thanks or reward or picstigc Or 
he would leave something he didn’t want to do anynv'ay in order to 
engage in some helpful activity if it were very interesting on other 
grounds and if people would applaud him foi doing so But he docs 
not care two cents about the object or about helpfulness as a duty or 
as the pnipcr thing or as a way of winning favor or grace 

(Hartshome and May, 1929, p 82. By permission of The Macmilldn Co ) 
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Measures of Integration, The correlation of tests of honesty, co- 
operation, persistence, inhibition, moral knowledge, attitudes, and 
intellect tended to be so nearly zero that children in the fifth to 
eighth grades appeared to be loosely organized in these respects. It 
was possible, however, to calculate for each child the consistency 
of his scores. Illustration 229 shows two profiles, one characterized by 
remarkable consistency and the other by great variation. Both have 


ILLUS. 229. PROFILE OF HONESTY SCORES IN
TWENTY-ONE SITUATIONS

[Graph not reproduced. Horizontal axis: Tests.]

(Hartshorne and May, 1930, p. 290. By permission of The Macmillan Co.)

the same average SD scores on twenty-one measures of deception. The
index of integration chosen was the standard deviation of the SD
scores. The boy in Illustration 229 was found to have an index of
.296, and the girl an index of 1.114. The reliability of such indices
was found to be in the neighborhood of .411 for tests of deception.
The correlation between indices and total honesty scores was .522, or
.882 when corrected for attenuation.
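The two computations named in this paragraph can be sketched briefly. This is only an illustration under stated assumptions: each child's results are taken to be already expressed as SD (standard) scores, the population form of the standard deviation is assumed because the text does not say which form was used, and the correction for attenuation is taken to be Spearman's usual formula, the observed r divided by the square root of the product of the two reliabilities. The function names, the two sample profiles, and the reliabilities in the last line are hypothetical.

```python
import math
from statistics import pstdev


def integration_index(sd_scores):
    """Standard deviation of one child's SD scores across tests.

    A small value means a flat, consistent profile; a large value
    means wide variation from test to test. Using the population
    form (pstdev) is an assumption, not something the text states.
    """
    return pstdev(sd_scores)


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Spearman's correction: observed r over sqrt(rel_x * rel_y)."""
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical profiles with the same average but different spread.
steady = [0.2, -0.1, 0.0, 0.1, -0.2]
erratic = [1.5, -1.2, 0.9, -1.4, 0.2]

print(round(integration_index(steady), 3))
print(round(integration_index(erratic), 3))

# Illustrative reliabilities only; they are not taken from the text.
print(round(correct_for_attenuation(0.40, 0.50, 0.80), 3))
```

If the reported .411 is taken as the reliability of the index, the pair of figures .522 and .882 would imply a reliability of roughly .85 for the total honesty score. That value is an inference from the formula, not a figure given in the text.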

Similar indices of integration were calculated on the basis of
twenty-three measures of various sorts, including intellect, emo-
tional stability, culture, honesty, cooperation, good citizenship,
moral knowledge, opinion, teachers' records and ratings, and chil-
dren's Guess Who tests. The average intercorrelation for these
twenty-three measures was .30 for a group of one hundred cases from
nine different classrooms. The sums of the twenty-three scores cor-
related .110 with indices of integration. Illustration 230 shows the
correlations between integration indices and various sub-batteries.
The conduct score appears to have less relation to the index of in-
tegration than measures of knowledge, opinion, and reputation. In-
tellect, as shown by the CAVI tests, cultural status, and emotional
stability show a correlation below .20 with integrity, and age cor-
relates nearly zero. There is a marked tendency for those who are
more aggressive in deception to have lower indices of integrity than
do those who are more aggressive in service and persistence.

ILLUS. 230. CORRELATION OF GENERAL INTEGRATION WITH
VARIOUS MEASURES

CONDUCT                      Raw r      Corrected r†
School honesty                .354          .61
Service total                 .179          .38
Inhibition total*             .296          .74
Persistence total            -.066         -.13

KNOWLEDGE                    Raw r      Corrected r†
Good citizenship           [illegible]      .87
Information                [illegible]      .53
Opinion A + B              [illegible]      .58

REPUTATION                   Raw r      Corrected r†
Teachers' marks
Deportment
Conduct record
"Guess Who"
Total reputation              .400          .70

ABILITY AND STATUS           Raw r      Corrected r†
CAVI (sigma)                  .120          .20
Resistance to suggestion      .246          .49
Emotional stability           .194          .37
Self-Functioning              .289          .51
Age                          -.041         -.06
Sims (socio-economic)         .138          .24
Burdick (culture)             .184          .33

* Omitting the Picture Inhibition Test. † Corrected for attenuation.

(Hartshorne and May, 1930, p. 351. By permission of The Macmillan Co.)

The interpretation of this material is difficult, since all one has to
be guided by is a pupil's relative position in his class on twenty-three
tests and ratings. It was shown that a pupil might have a low in-
tegration index and still be rated as well adjusted by a teacher. The
index of integrity is thus not an indication of consistent individual
behavior, but rather of consistency of measures of abilities and ad-
justment among a group of persons. The authors conclude that
motives, interests, and attitudes are dependent upon a variety of
environmental and natural forces, some of which tend to make per-
sons alike and some different from one another. Opinions and be-
havior scores of parents and associates were found to correlate fairly
well with scores of pupils.

Records of Observation 

Records of direct observation of persons are among the most valu- 
able indicators of adjustments. Three procedures are commonly 
used. In one a continuous record is kept of all important episodes 
and their stimuli 24 hours a day. This sort of record can only be 
made in a hospital, or camp, or home situation where trained ob- 
servers are available. The second procedure is called time sampling. 
Many scattered periods, usually from 5 seconds to 6 minutes, are used 
for observing a particular person. The third procedure records ob-
servation in a single interview or test period. All three of these proce-
dures will be described.

One of the most thorough studies of a 24-hour record of a family 
situation was reported by Buhler (1939), who sent trained workers 
into private homes. The work of Newcomb (1929) is an outstand- 
ing example of analysis of detailed daily records made by counselors 
at a summer camp. In order to check the value of the general type 
concept of extroversion-introversion, he selected thirty items usually 
found in extroversion scales. Each item was printed with a 4-point 
scale of specific behavior, as shown in Illustration 231. The specific
behavior consistency of each boy on each situation was fairly high
from day to day (mean r equaled .78), but considerable variation took
place over longer periods.

ILLUS. 231. DAILY BEHAVIOR RECORD
Numbers indicate how many times a day the behavior was observed.

Date ..........    Counselor ..........

NAME OF BOY:    John    Henry    Pete    Ed    Plant

Did he show confidence in his own abilities?
    boasted loudly of greater abilities than he had
    spoke confidently of ability he really had
    expressed lack of confidence in own abilities
    hesitated even to try his ability

Did he take the initiative in organizing games?
    insisted on doing the organization himself
    gave constant advice to the leader
    helped to plan but loyal to the leader
    let others do all the planning — follower

Did he submit to criticism or discipline from counselors?
    resisted violently or fought back
    retorted angrily, sarcastically, with threats
    showed resentment by mumbling or sullenness
    accepted it quietly and in good spirit

[The sample tallies entered under each boy's name are not legible in this copy.]

A new record sheet was used each day to avoid the possibility that the counselor
might be influenced by previous records.

(After Newcomb, 1929, p. 21. By permission of the Bureau of Publications,
Teachers College, Columbia University.)

From the same items he determined "trait" consistency by group-
ing together those specific behavior situations which seemed to in-
volve similar activities or goals. Thus, under volubility were clustered
instances of loud threats, loquacity, chattering, boasting, and an-
nouncing intention. An average intercorrelation of .26 was found
among specific situations which were thus grouped together. This
was not considered to be evidence of the existence of a central factor.
Similarly, ten traits were distinguished and scored: energy output,
ascendancy toward authority, ascendancy toward other boys, volubil-
ity, seeking limelight, interest in environment, impetuousness, social
forwardness, ease of distraction, and preference for the group. The
mean intercorrelation of total trait scores was only .37, a fact which
led Newcomb to conclude that there was no evidence here for an
extroversion-introversion type of person.
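A mean intercorrelation of the kind reported here can be sketched as the average of the Pearson coefficients over every pair of measures. The sketch below assumes that definition; the function names and the trait totals for five boys are hypothetical and are not Newcomb's data.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_intercorrelation(scores_by_measure):
    """Average r over all distinct pairs of measures."""
    pairs = list(combinations(scores_by_measure.values(), 2))
    return sum(pearson_r(x, y) for x, y in pairs) / len(pairs)


# Hypothetical trait totals for five boys (one list per trait).
traits = {
    "volubility": [12, 7, 9, 3, 5],
    "seeking limelight": [10, 8, 6, 4, 2],
    "energy output": [5, 9, 4, 8, 6],
}

print(round(mean_intercorrelation(traits), 2))
```

With ten traits there are 45 such pairs, so a low average may conceal a few fairly high coefficients among many near zero; the single figure should be read with that in mind.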
Ratings of behavior, based on several months of observation,
have come to hold an important place in surveys of school popula-
tions. Two types of schedules are illustrated in the widely used form
prepared by Haggerty, Olson, and Wickman (1930). Schedule A
asked the rater to indicate the frequency of occurrence on a 4-point
scale of fifteen types of behavior problems, such as cheating, lying,
temper outbursts, speech difficulties, imaginative lying, sex offenses,
and truancy. In total scores the more serious problems, as shown from
clinical records, were given three times as much weight as the less
serious. Schedule B has thirty-five items in four divisions. Items in
the first division referred to intellectual aspects of behavior, in the
second to physical, in the third to social, and in the fourth to emo-
tional aspects. Each of these items, examples of which are shown in
Illus. 232, is in the form of a 5-point rating scale. The points were
assigned by finding the average scores on Schedule A for pupils who
received a particular rating on an item in Schedule B. Illustration
232 shows that the subdivisions of the item, How intelligent is he?
were assigned points on the basis of lack of intelligence. The reason
for doing this is that the groups having the lowest intelligence rating
also had the greatest number of behavior problems. In Item 2, Is he
abstracted or wide awake? the points do not correspond directly to
amounts of this trait, for it was found that the most alert had more be-
havior difficulties than two less alert groups.
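The way points might be attached to the steps of a Schedule B item can be sketched as follows. The text says only that the points rest on the average Schedule A score of the pupils who received each rating; converting those averages to points by simple rank order is an assumption made here for illustration. The function name, category labels, and scores are hypothetical.

```python
from collections import defaultdict


def empirical_points(ratings, schedule_a_scores):
    """Assign points to each rating category of one Schedule B item.

    Categories are ordered by the mean Schedule A score of the pupils
    placed in them. The category with the fewest behavior problems
    gets 1 point and the one with the most gets the highest value.
    Rank-based conversion is an assumption, not the authors' stated rule.
    """
    grouped = defaultdict(list)
    for category, score in zip(ratings, schedule_a_scores):
        grouped[category].append(score)
    means = {c: sum(v) / len(v) for c, v in grouped.items()}
    ordered = sorted(means, key=means.get)  # fewest problems first
    return {category: rank for rank, category in enumerate(ordered, start=1)}


# Hypothetical ratings on an alertness item, with Schedule A scores.
ratings = ["absorbed", "absorbed", "abstracted", "present",
           "wide-awake", "wide-awake", "keen", "keen"]
a_scores = [40, 36, 30, 14, 8, 10, 22, 18]

points = empirical_points(ratings, a_scores)
print(points)
```

In this made-up sample the most alert group receives 3 points rather than 1, which is the kind of result the paragraph describes: the points follow the frequency of behavior problems, not the amount of the trait.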

Schedule B was found to correlate .60 with total scores on Schedule
A, and a composite score using both schedules correlated .76 with the
frequency with which children were sent to the elementary school
principal by teachers or monitors. The self-consistency of teachers
when rating twice within a short period was found to be .86 using
Schedule B, and the split-half reliability of a single rating was .92.
Ratings of two elementary school teachers commonly correlated
.60 with each other when rating small groups of children.

ILLUS. 232. BEHAVIOR RATING SCALE
Directions for Using
SCHEDULE B

1. Do not consult anyone in making your judgments.

2. In rating a person on a particular trait, disregard every other trait but that one. Many
ratings are rendered valueless because the rater allows himself to be influenced by a gen-
eral favorable or unfavorable impression that he has formed of the person.

3. When you have satisfied yourself as to the standing of this person in the trait on which
you are rating him, indicate your rating by placing a cross (X) immediately above the
most appropriate descriptive phrase.

4. If you are rating a child, try to make your ratings by comparing him with children of his
own age.

5. The masculine pronoun (he) has been used throughout for convenience. It applies
whether the person whom you are rating is male or female.

6. In making your ratings, disregard the numbers which appear below the descriptive
phrases. They are for use in scoring.

DIVISION I

1. How intelligent is he?
    (5) Feeble-minded
    (4) Dull
    (3) Equal of average child on street
    (2) Bright
    (1) Brilliant

2. Is he abstracted or wide awake?
    (5) Continually absorbed in himself
    (4) Frequently becomes abstracted
    (2) Usually present-minded
    (1) Wide-awake
    (3) Keenly alive and alert

3. Is his attention sustained?
    (5) Distracted: jumps rapidly from one thing to another
    (4) Difficult to keep at task until completed
    (3) Attends adequately
    (1) Is absorbed in what he does
    (2) Able to hold attention for long periods

DIVISION II

8. Is he slovenly or neat in personal appearance?
    (3) Unkempt, very slovenly
    (4) Rather negligent
    (2) Inconspicuous
    (1) Is concerned about dress
    (3) Fastidious, foppish

9. How does he impress people with his physique and bearing?
    (5) Repulsive
    (4) Makes an unfavorable impression
    (3) Generally unnoticed physique and bearing
    (2) Makes a favorable impression
    (1) Excites admiration

(Haggerty, Olson, and Wickman, 1930. By permission of the World Book Co.)

An unusually thorough schedule designed to appraise causes as
well as symptoms is the Detroit Scale for the Diagnosis of Behavior
Problems by Baker and Traphagen (1935). It requires ratings by a
trained investigator on sixty-six items. Each item is rated on a 5-
point scale: 1 very poor, 2 poor, 3 fair or average, 4 good, and 5 very
good. The ratings are to be based on direct observations, medical
and school records, and questioning of both parents and children.
The ratings are described in detail for each item. Thus for Item 25,
Later Recreational Facilities, the pupil is asked, "How do you amuse
yourself after school, or during vacation? What do you have to play
with?" and the parents are asked, "What does he have to play with
and how does he spend his time?" Six additional questions for par-
ents (p. 65) are also suggested:

What things does he have to play with? With what does he amuse himself?

Is there a purpose in his play or is it spasmodic and of no educational
value?

Does he play at having shows or at make-believe school? Does he play
ball, skate, or build model houses?

Does he have a pet and does he care for it well?

Does he get any companionship with his parents through sharing play
and recreation with them?

Does he complain that time hangs heavy on his hands?

Finally, for rating this factor, the following scale (p. 66) is given:

Points  Factors

5  Has a few well-organized things to do and desirable companions in
   doing them.

4  Has good attitude but little opportunity to play or express himself
   constructively with companions.

3  Except for one or two things or for short periods, does not know
   what to do.

2  Has little to play with, is alone most of time, has no particular pur-
   pose or drive to activities.

1  No purpose in play activities, no place to play, playthings in poor
   shape, plays mostly away from home, no interest on parents' part.

The sixty-six items are classified under five headings: (1) health
and physical factors, (2) personal habits and recreational factors, (3)
personality and social factors, (4) parental and physical factors of the
home, and (5) home atmosphere and school factors. A summary sheet
shows the rating assigned to each item, the number of items which
were given each rating, and also a total score obtained by simply add-
ing all the credits together. The total score is then transmuted into
a letter grade according to a table, which gives the letters nearly the
same significance as that found in the usual class in elementary school
or in the United States Army Alpha tests. The following summary of a
case (from pp. 336-38) will illustrate the procedure.

CASE 5. Behavior Rating C-

D. M., considered as a behavior case, was a fifteen-year-old boy, with a
score of 218. The distribution of his items is as follows:

SUMMARY OF FACTORS

Category      Number    Weighted Score

very poor        5            5
poor            13           26
fair            22           66
good             9           36
very good       17           85

                66

He was considered as an extreme behavior case in his locality, which was
somewhat above the average. The five very poor items were as follows: early
health, Item 1, seems to have had a considerable influence on his problem,
as his father particularly was very indulgent to him on account of his
health. Item 29, on anger and rage, shows that he was inclined to yield to
fits of anger easily and was frequently picked on by other children. Item
[number illegible], vocational interests, seems to be very negative on
account of his health, poor scholastic record, and too much parental
indulgence. Item 18 shows that
the father is an unskilled laborer doing truck driving and janitorial jobs,
with some possible feelings of inferiority about it. As to Item 65, D. M.'s
record was consistently poor, with D's and E's, which reflect in part his
poor vocational interests.

The items rated poor are as follows: degree of vision defect, Item 7 —
he always had very poor vision in one eye, but the other eye was approxi-
mately normal. His father was rather opposed to making any adjustments
to improve his vision. Item [number illegible], accidents — he was hit by
an automobile,
with minor injuries, and two years before he had broken his nose in a
bicycle accident. Item 15 — he was very unattractive in appearance because
of his crossed eyes. Items 16 and 17 — his early care and present care were
poor. Items 19 and 20 — he ate meals at irregular times, hurried his eating,
and was rather fussy and disagreeable about what he ate. Item 33 showed
an IQ of 70 on the Stanford-Binet Test, which explains some of his diffi-
culty in making adjustments to regular grade standards. Item 35, initiative
and ambition, showed him not very ambitious. Items 40 and 41, education
of both parents, showed that they had completed only two or three grades
and in European countries. Item 61 — his discipline was inconsistent and
modified by pity on account of his physical health. Item 66, his attitude
toward the school — in this respect he was poor because the regular school
work was too difficult for him. The parents refused to place him in any
type of special class.

It seems that in D. M.'s case, in the next two or three years of school
vocational and social adjustments would be critical factors in determining
whether he would make a successful final adjustment or whether this com-
bination of factors would tend to carry him downhill.
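The arithmetic behind the summary of factors in Case 5 can be checked with a short sketch. It assumes only what the text states: each item earns 1 to 5 credits according to its rating, and the total score is the plain sum of credits. The names used below are hypothetical, and the table that converts a total into a letter grade is not reproduced in the text, so no conversion is attempted.

```python
# Credits per rating, as given in the description of the 5-point scale.
WEIGHTS = {"very poor": 1, "poor": 2, "fair": 3, "good": 4, "very good": 5}


def summarize(counts):
    """Return (number of items, weighted score per category, total score)."""
    weighted = {cat: counts[cat] * WEIGHTS[cat] for cat in WEIGHTS}
    return sum(counts.values()), weighted, sum(weighted.values())


# Number of items at each rating for Case 5, taken from the summary table.
case_5 = {"very poor": 5, "poor": 13, "fair": 22, "good": 9, "very good": 17}

n_items, weighted, total = summarize(case_5)
print(n_items)   # 66 items in all
print(weighted)  # 5, 26, 66, 36, and 85 credits by category
print(total)     # 218, the score reported for D. M.
```

The check confirms that the five category counts account for all sixty-six items and that the weighted scores add to the reported total of 218.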

Baker and Traphagen reported correlations between total scores
and each item on two groups: 189 behavior cases and 180 nonbehavior
cases. The total scores on the first group correlated most highly with
family recreation, .699; ideals of the home, .657; conditions of eat-
ing, .562; time of sleeping, .555; father's age at birth of child, .551;
economic status, .526; child's intelligence, .525; mother's personal-
ity, .495; and similar items. Among the nonbehavior cases, the total
scores correlated most highly with scholarship, .663; later recrea-
tional facilities, .616; personal hygiene, .583; initiative and ambi-
tion, .569; discipline, .558; mother's intelligence, .542; conditions of
eating, .526; father's intelligence, .512; mother's personality, .500;
and early self-care, .507.

The following items had low correlations with total scores among
the behavior-problem children: children's diseases, .03; size for age,
.11; speech defects, .13; infectious diseases, .182; early self-care,
.17[third digit illegible]; mother's health, .200; and broken home, .231.
Among the well-
adjusted group the total scores showed low correlations with general
home atmosphere, .117; children's diseases, .05; defects of vision, .151;
size for age, .230; economic status, .220; and child's intelligence, .275.

These figures illustrate very well the results of correlation analyses
of different groups of pupils. The differences among poorly adjusted
pupils seemed to be largely dependent upon the interplay of living
conditions and the personality of parents and associates. Among the
well-adjusted group, total scores corresponded to differences in schol-
arship, recreations, and personal hygiene. When both groups of
pupils were combined, a correlation of .91 was found between total
scores and a combination of the five items: discipline, child's atti-
tude toward the home, parents' attitude toward child, general be-
havior, and scholarship.
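Item-by-total correlations of the sort reported above can be sketched as follows. The sketch assumes that each item is correlated with a total that includes the item itself, since the text does not say whether the item was first removed from the total. The function names, item labels, and ratings for five children are hypothetical and are not Baker and Traphagen's data.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_total_correlations(ratings):
    """Correlate each item's ratings with the sum of all item ratings.

    ratings maps an item name to a list of ratings, one per child,
    with the children in the same order for every item.
    """
    n_children = len(next(iter(ratings.values())))
    totals = [sum(item[i] for item in ratings.values())
              for i in range(n_children)]
    return {name: pearson_r(scores, totals)
            for name, scores in ratings.items()}


# Hypothetical 1-to-5 ratings of five children on three items.
ratings = {
    "family recreation": [1, 2, 3, 4, 5],
    "scholarship": [2, 2, 3, 5, 4],
    "children's diseases": [3, 3, 3, 2, 4],
}

for name, r in item_total_correlations(ratings).items():
    print(name, round(r, 3))
```

Running the same computation separately for each group, as the authors did, shows which items carry the differences in total score within that group; an item can rank high in one group and low in the other.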

Interviews 

Employment officers have long relied upon their judgments of an
applicant's modes of adjustment. Doubtless, capable interviewers can
size up fairly well a person's moods and mannerisms during an inter-
view, but the brevity of the interview and the bias of the interviewer
may result in fragmentary or erroneous records. In order to help
interviewers and oral examiners make accurate and systematic ap-
praisals, the rating form shown in Illustration 233 was prepared.
This form is also typical of many that are used in ratings of employees
who have been in service for a long time. It consists of nine general
aspects, each of which is described by a brief paragraph and followed
by a graphic and descriptive rating scale. This particular form, which
has been made so that it can be scored by a machine, gives a profile
for each individual.

ILLUS. 233. RATING FORM FOR USE OF INTERVIEWERS AND
ORAL EXAMINERS — 2 — 1938

INSTRUCTIONS: Ask yourself how this applicant compares with those
who are doing work of this kind. Consider whether his voice, appearance,
etc., would be a liability or an asset in such a position. Rate him by
making a check at that point on each scale where, in your judgment,
the applicant stands. Rate the following traits:

1. VOICE AND SPEECH. Is the applicant's voice irritating, or pleasant? Can
you easily hear what he says? Does he mumble, or talk with an accent which of-
fends or baffles the listener? Or is his speech clear and distinct, his voice so
rich, resonant and well-modulated that it would be a valuable asset in this position?

2. APPEARANCE. What sort of first impression does he make? Does he look
like a well-set-up, healthy, energetic person? Has he bodily or facial characteristics
which might seriously hamper him? Is he well-groomed or slovenly? Erect or
slouchy? Attractive or unattractive in appearance?

3. ALERTNESS. How readily does he grasp the meaning of a question? Is
he slow to apprehend even the more obvious points, or does he understand quickly,
even though the idea is new, involved or difficult?

4. ABILITY TO PRESENT IDEAS. Does he speak logically and convincingly? Or
does he tend to be vague, confused or illogical?

5. JUDGMENT. Does he impress you as a person whose judgment would be de-
pendable even under stress? Or is he hasty, erratic, biased, swayed by his
feelings?

6. EMOTIONAL STABILITY. How well poised is he emotionally? Is he touchy,
sensitive to criticism, easily upset? Is he irritated or impatient when things go
wrong? Or does he keep an even keel?

7. SELF-CONFIDENCE. Does he seem to be uncertain of himself, hesitant, lacking
in assurance, easily bluffed? Or is he wholesomely self-confident and assured?

8. FRIENDLINESS. Is he a likeable person? Will his fellow-workers and subordi-
nates be drawn to him, or kept at a distance? Does he command personal loyalty
and devotion?

9. PERSONAL FITNESS FOR THE POSITION. In the light of all the evidence re-
garding this person's characteristics (whether mentioned above or not) how do you
rate his personal suitability for work such as he is considering? Recalling that it
is not in his best interest to recommend him for such a position if he is better
suited for something else, would you urge him to undertake this work? Do you
endorse his application?

Fuller instructions and space for comments on applicant's behavior will
be found on the back of this sheet.

Applicant's Name or Identification Number ..........  Date ..........
Kind of work for which his suitability is appraised ..........

Descriptive phrases printed along each graphic scale, from lowest to highest:

1. Irritating or indistinct / Understandable but rather unpleasant /
   Neither conspicuously pleasant nor unpleasant / Definitely pleasant
   and distinct / Exceptionally clear and pleasing

2. Unprepossessing or unsuitable / Creates rather unfavorable impression /
   Suitable, acceptable / Creates distinctly favorable impression /
   Impressive, commands admiration

3. Slow in grasping the obvious, often misunderstands meaning of questions /
   Slow to understand subtle points, requires explanation / Nearly always
   grasps intent of interviewer's questions / Rather quick in grasping
   questions and new ideas / Exceptionally keen and quick to understand

4. Confused and illogical / Tends to scatter or to become involved /
   Usually gets his ideas across well / Shows superior ability to express
   himself / Unusually logical, clear and convincing

5. Notably lacking in balance and restraint / Shows some tendency to react
   impulsively and without restraint / Acts judiciously in ordinary
   circumstances, might be hasty in emergencies / Gives reassuring
   evidences of habit of considered judgment / Inspires unusual confidence
   in probable soundness of judgment

6. Oversensitive, easily disconcerted / Occasionally impatient or
   irritated / Well poised most of the time / Superior self-command /
   Shows exceptional poise, calmness and good humor under stress

7. Timid, hesitant, easily influenced / Appears to be over-self-conscious /
   Moderately confident / Wholesomely self-confident / Shows superb
   self-assurance

8. Keeps people at a distance / Does not easily attract friends /
   Approachable, likeable / Draws many friends to him / An inspirer of
   personal devotion and loyalty

9. Unsuited for this work, not endorsed / Might do well, endorsed with
   hesitance / Endorsed / Endorsed with confidence / Endorsed with
   enthusiasm

Signature of rater ..........

This rating form prepared from suggestions furnished by W. V. Bingham.
(By permission of W. V. Bingham and the International Business
Machines Corporation.)

Another worthwhile approach, reported by Brody and Powell
(1947), is called group performance or group interviewing. Here a
group of from four to six candidates are seated around a table and
asked to discuss informally a topic of some interest and complexity.
Three or more observers record for each candidate the main activities
under the following six headings (Brody and Powell, 1947, p. 287 and
following).

1. Appearance and Manner: poise, physical alertness, nervousness, atten-
tiveness, mannerisms

2. Speech: power of expression, vocabulary, diction, modulation

3. Attitude Towards Group: tact, cooperation, ability to mix, flexibility

4. Leadership: to assume lead without giving offense, acceptance
by group

5. Contribution to Group Performance: teamworker or prima donna,
awareness of objectives of group discussion, ability to reconcile differ-
ences

6. Scientific Approach: ability to marshal data, awareness of implications,
ability to reason, ingenuity, mental alertness, judgment

Brody and Powell pointed out that exact quantitative data are not
available for the evaluation of the group-performance test. However,
they did set down a number of apparent advantages. Their rationale
in opposition to each point made is also quoted below it.

1) The group-performance test enables the rating examiners to observe
each candidate in action for a period of 3 hours. In the same amount of
examiner-time each applicant could be granted an individual interview of
only 20 minutes.

But raters did not observe each candidate 3 hours. Rather, 3 hours
of observing time was distributed among the candidates, necessarily
in rough relation to such items as the time taken for speaking by
each individual, interesting physical or behavioral characteristics of
candidates, visual and auditory considerations, and like attention-
arresting factors.

2) It permits each examiner to devote full time to observing, listening,
and taking notes.

Examiners' attention must lag intermittently and unpredictably.
Without the stimulation of continuing verbal contact between candi-
date and rater, the mind of the examiner may wander increasingly
as the test goes on.

3) It eliminates any tendency on the part of the examiners to use the
oral interview as a means of impressing the other panel members with
their own knowledge and skill in questioning and subject matter.

Any such tendency posits a type of examiner best dealt with by
eliminating him from the examining process, or at least by ap-
propriate preliminary briefing. Deprived of the opportunity to strut,
such an examiner will, if allowed to continue, merely find other un-
desirable outlets for his peculiarities.

4) It prevents any loss of reliability caused by the use of different ques-
tions for different candidates as well as by the information given to later
candidates by those examined earlier.

This assumes that the candidates are so few in number as to be
expeditiously handled in one group session. Furthermore, in the
ordinary oral test, it is quite possible to arrange to use the same ques-
tions, if desired, for different candidates. In the group oral-perform-
ance test itself, standardization of detail tends to be at a minimum,
inasmuch as the candidates may roam almost as they will.

5) It minimizes the effect of the inevitable lack of continuous concen- 
tration on the part of the examiners. 

But it tends to maximize the probability that the examiners' con-
centration will not be continuous. The fact that the examiner is, in
a sense, himself under observation in the usual oral test keeps him, at 
least seemingly, wide awake. 

6) It provides a more natural situation than the usual question-and- 
answer contact between a candidate and his examiners. 

It is certainly unnatural for adults to be required to discuss a par-
ticular matter under the silent and somewhat remote inspection of
examiners. The usual question-and-answer contact possesses the vir-
tue of being usual; candidates are accustomed to such a situation and
are more likely to regard it as natural.

7) It eliminates the suspicion on the part of any candidate that other
candidates may be received more kindly and may be given easier questions.
It may even convince him that some other candidates are better qualified
than he is.

This is errant assertion, not evidence. Suspicious candidates could
in any case imagine partisanship or other bias on the part of the ex-
aminers.

8) It eliminates the following dangers in the conventional oral-interview
situation:

a) There exists a tendency for panel members to slant their attention
to the candidate's response to their own questions.

b) It has been found that the rating given to any candidate tends to
carry over positively to the following candidate. The consequence is
a kind of halo effect.

a) In the conventional interview we can be quite sure that the
rater listens to the answers to his own questions, if nothing else. A
similar near-guarantee is absent in the group situation.

b) Perhaps kindred biases occur in the group test. Where individ-
ual candidates sit is determined by chance, but where they sit may
be correlated with their rating.

9) It provides very valuable information concerning the attitude of
each candidate toward the other members of the group, as well as of his
reaction to their attitudes. This is particularly important in testing for po-
sitions where group discussions and conferences are essential.

But the situation may be regarded as so unreal as to yield perverted
findings. In the real conference situation, each member generally has
a defined status. More importantly, conferees probably have prior
acquaintance with one another, and the good conferee will use to
advantage the information which he possesses about the personalities
of his colleagues in reaching a predefined objective.

10) It presents specific evidence concerning the ability of each candidate
to be a leader in a group.

More accurately, the group test provides specific evidence of the
candidate's ability to lead the particular group in whose deliberations
he participates on a specified subject under the conditions set in the
test. It remains possible that varying elements of the situation may
result in varying the performance of candidates.

11) Those who participate (examiners as well as candidates) find it more
interesting than the individual interview.

Interest is not to be confused with validity and public-relations
values. The latter are the significant aspects of a test.

12) It requires no skill in asking questions on the part of examiners.

Here, in fact, is one of the greatest weaknesses of the group ex-
amination. Opportunity is denied to the examiner to explore the
candidates' statements, to follow up leads in order to verify the ap-
parent competence or incompetence of the applicant.

Obviously, there is much to be said for and against the group oral-
performance test. The test is promising in many ways; it has decided
weaknesses. Perhaps the best argument which can be advanced at
the present time in support of the group test is a negative one — it is
unlikely to be as bad as the orthodox oral-interview test.

Certainly additional experimentation by public personnel agen-
cies with the group examination is most desirable. To be sure, this
is a feeble conclusion and bears resemblance to the traditional posi-
tion of the liberal whose feet are firmly planted in mid-air. Yet the
fact appears to be that all that can confidently be said about the
group oral-performance test is that its possible utility is worth ex-
ploring.

In connection with their study of sexual behavior in the human
male, Kinsey, Pomeroy, and Martin (1948) have listed a number of
important rules for interviewing. This study required the cooperation
of persons largely unknown to the interviewers in obtaining data
on from 300 to 500 items in one interview which could not last much
more than 2 hours. To be most effective they found it was necessary
to be introduced to the subject by someone the subject trusted,
usually someone who had already been interviewed. Next it was
necessary to put the subject at ease, to insure absolute privacy, and
to show interest by face-to-face talking without any evasion. Another
important aspect of the interviewing was the sequence of topics.
Interviews should progress from easier to harder types of information,
from the least disturbing to the more intimate or possibly disturbing.
Variations in the order and in the actual form of questions, however,
did not permit any variations in the kinds of question, and were not
allowed to interfere with a systematic consideration of the interview.
The authors pointed out that it is necessary to recognize the mental
capacity of the informant. Persons of various degrees of mental
ability required different rates of speed and different types of
questions. They found that fast, accurate coding of the material during
the interview was by far the best method of recording. It resulted
in much more accurate records and also made the subjects feel that
the interview was important and was being properly recorded. They
stressed the importance of asking matter-of-fact questions to
determine overt behavior as well as questions which elicited attitudes
or opinions or feelings.

Carl R. Rogers and his students have contributed notably to the
appraisals of therapeutic interviews by making electrical recordings
and subjecting these to detailed analysis. They have thus been able
to determine what kinds of statements tend to lead to acceptance,
spontaneity, and insight by the subject. They have used these analyses
to train therapists and to predict the outcome of therapy. A good
example of a check list is that reported by Porter (1943) (Illus. 234),
which includes twenty-four categories under four main headings:
(1) defining the interview situation, (2) bringing out and developing
the problem situation, (3) developing the client's insight and
understanding, (4) sponsoring client's activity/fostering decision
making. The various categories were used, after some training, by
thirteen judges in appraising eighteen recorded interviews. The
agreement between judges was perfect in 31.6 per cent of all codings,
and the correlation between pairs of judges in total number of
identifications was approximately .95 for both typewritten transcripts
and sound recordings. The ratio of words spoken by the counselor to
those spoken by the client was also reported. Counselors were found to
have fairly consistent ratios, and the most talkative counselor had a
ratio twenty-seven times as large as that of the least talkative.
Differences in points of view on counseling were reflected in the
patterns of the counselor interviews, and the effects of training in
nondirective counseling were quite clear in these quantitative codings.
The directive counselors used many more direct questions and
suggestions (Illus. 234, categories 2a, 2b, 2c, 3d, 4, and 5a). The
nondirective counselors did much less talking than the clients and they
used categories 3a, 3b, and 3c almost exclusively.

ILLUS. 234. CATEGORIES IN THERAPEUTIC COUNSELING

Code No. ________   Judge ________   Date ________

DEFINING THE INTERVIEW SITUATION

1a  Defines in terms of diagnostic/remedial purposes, procedures, etc.
1b  Defines in terms of client responsibility for directing the
    interview/reaching decisions, etc.
1m  (Unclassifiable)

BRINGING OUT AND DEVELOPING THE PROBLEM SITUATION

Uses lead which
2a  Forces choosing and developing of topic upon client.
2b  Indicates topic but leaves development to client.
2c  Indicates topic and delimits development to confirmation,
    negation, or the supplying of specific items of information.
2m  (Unclassifiable)

DEVELOPING THE CLIENT'S INSIGHT AND UNDERSTANDING

Responds in such a way as to indicate
3a  Recognition of subject content or implied subject content.
3b  Recognition of expression of feeling or attitude in immediately
    preceding verbal response(s).
3c  Interpretation or recognition of feeling or attitude not expressed
    in immediately preceding verbal response(s).
3d  Identifies a problem, source of difficulty, condition needing
    correction, etc., through test interpretation, evaluative
    remarks, etc.
3e  Interprets test results but not as indicating a problem, source of
    difficulty, etc.
3f  Expresses approval, disapproval, shock or other personal reaction
    in regard to the client (but not to identify a problem).
3m  (Unclassifiable)

4   Explains, discusses, or gives information related to the problem
    or treatment.

SPONSORING CLIENT ACTIVITY/FOSTERING DECISION MAKING

Proposes client activity
5a  Directly or through questioning technique.
5b  In response to question of what to do.
Influences the making of a decision by
5c  Marshaling and evaluating evidence, expressing personal opinion,
    persuading pro or con.
5d  Indicates decision is up to client.
5e  Indicates acceptance or approval of decision.
5f  Reassures.
5m  (Unclassifiable)

I   Irrelevant
O   Otherwise unclassifiable          Total number of checks ________

Nondirective  1  2  3  4  5  6  7  8  9  10  11  Directive

(By permission of E. H. Porter and the editor of Educational and
Psychological Measurement.)
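
The two quantitative indices mentioned for Porter's check list, the
percentage of codings on which the judges agree perfectly and the ratio
of counselor words to client words, are simple to compute. The following
is only a minimal sketch in Python; the function names and the sample
codings are hypothetical and are not taken from Porter's report.

```python
def percent_perfect_agreement(codings):
    """Percentage of statements on which every judge chose the same category.

    `codings` holds one inner list per counselor statement; each inner list
    contains the category label assigned by each judge (hypothetical data).
    """
    unanimous = sum(1 for labels in codings if len(set(labels)) == 1)
    return 100.0 * unanimous / len(codings)


def talk_ratio(counselor_words, client_words):
    """Ratio of words spoken by the counselor to words spoken by the client."""
    return counselor_words / client_words


# Illustrative codings by three judges for four counselor statements.
sample_codings = [
    ["3a", "3a", "3a"],   # all judges agree
    ["2b", "2c", "2b"],   # one judge differs
    ["5a", "5a", "5a"],   # all judges agree
    ["4", "4", "3d"],     # one judge differs
]
agreement = percent_perfect_agreement(sample_codings)
ratio = talk_ratio(300, 1200)
```

With these made-up codings two of the four statements are coded
identically by all judges, and the counselor speaks one word for every
four spoken by the client.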

An interesting approach to the appraisal of an interview, reported
by Chapple (1949), makes use of an elaborate recording machine
called an Interaction Chronograph. This machine, which was developed
from several earlier models over a period of 10 years, consists
of a series of 10 printing counters and a signal counter, each activated
by a different key. The operator observes the interview and records
variations in the activities of both the applicant and interviewer by
pressing the various keys. From the record (see Chapple, 1949, p.
298) one obtains measures of:

1. Tempo: how often a person starts to act.

2. Activity or energy: how much longer he talks or responds than he is
silent.

3. Adjustment of applicant to interviewer: length of his interruptions
of the interviewer minus length of his failures to respond to the
interviewer.

4. Initiative: the frequency with which one person takes the initiative
from the other.

5. Dominance: the frequency with which one person out-talks or
out-acts the other when there has been an interruption.

6. Synchronization: the frequency with which a person fails to
synchronize with the other either by interrupting or by failing to
respond.

7. Number of Exchanges: the total number of responses to the other
person.
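
Three of the measures listed above are stated as simple counts or
differences, and may be sketched as follows. This is an illustration
only: the record layout, the field names, and the choice of starts per
minute as the unit for tempo are assumptions made here, not Chapple's
own scoring procedure.

```python
from dataclasses import dataclass


@dataclass
class InteractionRecord:
    """Hypothetical totals tallied for one person during one interview."""
    starts: int                  # number of times the person began to act
    talk_seconds: float          # total time spent talking or responding
    silence_seconds: float       # total time spent silent
    interruption_seconds: float  # total length of interruptions of the other
    no_response_seconds: float   # total length of failures to respond
    minutes_observed: float      # length of the observation period


def tempo(rec):
    """How often the person starts to act (assumed unit: starts per minute)."""
    return rec.starts / rec.minutes_observed


def activity(rec):
    """How much longer the person talks or responds than he is silent."""
    return rec.talk_seconds - rec.silence_seconds


def adjustment(rec):
    """Length of interruptions minus length of failures to respond."""
    return rec.interruption_seconds - rec.no_response_seconds


applicant = InteractionRecord(
    starts=30, talk_seconds=400.0, silence_seconds=200.0,
    interruption_seconds=12.0, no_response_seconds=20.0,
    minutes_observed=10.0,
)
scores = (tempo(applicant), activity(applicant), adjustment(applicant))
```

A negative adjustment score in this sketch means that the applicant's
failures to respond outlasted his interruptions.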

From these records several derived scores were obtained, such as
the rate per exchange of a variable, ability to listen, and flexibility.
No clear definition of these terms has yet come to hand, nor indication
of the relationships among the measures of results. Chapple
indicated that reliability correlations were satisfactory, and stated
that the observer must be thoroughly trained in basic concepts. The
criteria were objective in that a key was moved only when there was
a visible contraction or relaxation of facial muscles.

In order to secure easily comparable results among interviewers,
controls of content and behavior of the interviewer were rigidly
defined. The interviewer used three patterns of behavior during
different periods. In one the interviewer introduced a topic, then
endeavored to adjust to the subject by a nondirective type of behavior,
using such phrases as "That's interesting" or "Could you tell me
more about that?" After 10 minutes the interviewer changed the
topic and then ten consecutive times waited at least 15 seconds after
the applicant had stopped talking. If the applicant did not start
talking again within 15 seconds, the interviewer rephrased the
question. In the third procedure the interviewer interrupted and tried
to talk down the applicant after ten consecutive starts. Flexibility
is indicated by a person's tendency to react differently to these
situations. Studies of certain types of mental hospital patients showed
much greater rigidity or inflexibility than was found in normal
people.

Chapple compared the scores of sixty-six industrial line supervisors
with the scores of several hundred nonsupervisory workers. He found
that the supervisors were more active, quicker in tempo, and showed
more initial dominance at the beginning of the interview than the
nonsupervisory workers. They were also able to operate at a slower
pace (flexibility) when the second interviewing procedure was used.
Chapple noted differences in scores among maintenance foremen,
machine-shop foremen, foundry foremen, and superintendents. He
was also able to construct a single interaction index which correlated
well with general ratings of efficiency.

In another study various employees of a large department store
were compared. The activity variable alone discriminated well
between good and poor salesmen, and also between sales and clerical
workers. Ability to listen was important in selling jobs where one
had to find out a complicated customer specification, but it was a
handicap in a rapid over-the-counter transaction. More initiative was
needed in selling of high-priced articles. Personnel officers and
supervisors showed more flexibility and less dominance or aggression
than buyers, but both had high activity levels. Chapple stressed the
need of developing more accurate job profiles using Interaction
Chronograph scores for purposes of transfers and promotions and for
more research in defining and refining the behavior variables used.

STRESS SITUATIONS 

In stress situations the subject is aware that he is to some extent
"on the spot," and in some cases the limits of his endurance are
tested. These situations are described under painful stimuli,
reactions to tests, hand pressures, and lie detectors.

Painful Stimuli 

A laboratory approach to the measurement of persistence was
reported by Howell (1933), who tried to get at an element of
personality which seemed independent of strength or skill. He used tasks
which involved constantly increasing distress from one of several
sources, such as fatigue from gripping a hand dynamometer, forcing
a needle or blunt peg into the flesh, heat from an electric grill, and
electric shock. The procedure was to apply the stimulus gradually
until the student gave the signal to stop, indicating that he had
reached his limit of endurance. Individual tests were made with 102
subjects who were told that they were competing for rank and that
their names and ranks would be posted. Twenty-four of these
students had been previously threatened with dismissal from college for
academic failures.

The results of the tests involving strength and pressure were found
to correlate .44 with one's weight; hence a corrected score on these
tests was made to equalize weight. The reliability of the battery,
estimated by correlating the odd-even tests, ranged from .19 to .85
for small samples. The intercorrelations of the tests ranged from .18
to .72, median approximately .47. The willingness to endure the
needle and the willingness to endure heat had low correlation with
each other and with the painful-pressure situations. The total scores
on these persistence tests were found to correlate .81 with the
volitional-perseverance section of the Downey Will-Temperament
Profile, and .44 with the Allport (1928) Ascendance-Submission Test.
Persistence scores correlated .18 with "being a male," when the
influence of weight was held constant. The correlation between Ohio
State University Intelligence Test scores and persistence scores was
.10, but the latter correlated .44 with grades in college. The
prediction of grades from intelligence tests was .51. This was raised
to .64 when persistence scores were combined with intelligence by a
multiple correlation.
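
The rise from .51 to .64 can be checked from the three correlations
quoted in the paragraph. The sketch below assumes the standard formula
for the multiple correlation of one criterion with two predictors,
R^2 = (r1^2 + r2^2 - 2 r1 r2 r12) / (1 - r12^2); the function and
variable names are chosen here for illustration.

```python
import math


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2 : correlations of each predictor with the criterion
    r_12       : correlation between the two predictors
    """
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return math.sqrt(r_squared)


# Grades with intelligence .51, grades with persistence .44,
# intelligence with persistence .10 (figures quoted in the text).
R = multiple_r(0.51, 0.44, 0.10)   # about .64
```

Because intelligence and persistence are nearly uncorrelated, each adds
almost its full share to the prediction, which is why the combined
figure is well above either single correlation.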

James C. Coleman (1949) reported a careful study of the judgment
of emotions from a facial expression by 379 psychology students.
The photographs were made of persons who were subjected
to the following situations:

1. Natural: sitting quietly.

2. Effort: squeezing a hand dynamometer as hard as possible with both
hands.

3. Startle: sudden blast of horn.

4. Shock: severe electric shock received in cervical region.

5. Threat: told that shock would be repeated.

6. Disgust or horror: crushing a live snail with bare fingers.

7. Fear or horror: seeing live snake suddenly set free in front of
subject.

8. Humor: listening to a joke.

The judges were given printed lists of these situations, and asked to
indicate which photograph was made during each situation. Four
groups of subjects viewed the whole face, the mouth only, and the
eyes only, rotating the order of presentation. The results showed
that the identification of the photograph was more accurate from
the whole face than from its parts, but for situations 2 and 8 the
judgments from the mouth alone were nearly as accurate as those from
the whole face, and for situations 6 and 7, the judgments from the
eyes alone were nearly as accurate as from the whole face. Situation
8 was the only one where the accuracy reached approximately 00
per cent of judgments. The average percentage correct for all
situations was approximately 50. Situations 3, 4, 5, and 6 showed the
lowest accuracy of judgments, about 40 per cent correct. Coleman
also varied the procedure by having the subject photographed while
acting out the responses considered appropriate for each situation.
The natural and acted series of photographs had considerable
resemblance, but the acted series showed in most cases exaggerations
of the natural reactions. Coleman also secured introspective reports
from his subjects. These showed a considerable variety of feelings
toward the experimental situations. He concluded that the judgment
of emotions from photographs of facial expression under these
circumstances was in need of much more thorough control and study.

Reactions to Tests 

Elaborate reports of individual adjustments during test situations
have been prepared by a number of authors. Illustration 235 shows
a rating scale by Vernon (1938) which lists a large number of items.
Such lists will have to be supported by careful definitions and much
training before they will give consistent and significant results.

Hand Pressures 

An interesting laboratory technique, developed by Luria (1932),
has been widely applied in Russia and more recently in America.
He made two records of hand movements in conjunction with
free-association responses to words. The dominant hand was required to
press a bulb to which a pressure recorder was attached. The other
hand, about which nothing was said, was allowed to rest on a movable
plate to which a movement recorder was attached. For each
verbal response, his results showed the intensity of two kinds of
motor responses, voluntary and involuntary. From his results,
measures on a scale of integration were available. At one extreme of the
scale was complete coordination of speech and motor responses, and
at the other extreme involuntary motor responses of considerable
violence which often preceded or followed the voluntary hand and
speech reactions. Luria believed that the degree of integration
depends upon the cortical mechanism. In persons best integrated all
conflicts were settled in the cortical centers before motor responses
were made. The opposite was true in the most disorganized persons,
where motor responses in many patterns were typical of conflict
situations. The difficulty of the task could be varied experimentally.
Marked tendencies toward disorganization were found in small
children, feeble-minded persons, and hysterical patients.

ILLUS. 235. GENERAL RATING SCALE

Name ________   Date ________   Examiner ________

ACTIVITY
Excited, restless, unable to keep still
Quick and vivacious
Calm and deliberate
Inert and listless

Impulsive
Stable
Cautious
Inhibited

Poses, motor attitudes
Tics    Nail-biting    Twitchings
Fiddling with material, clothes, hands, feet
Peculiar expressions, excessive wrinklings

MOVEMENT
Fluent and graceful
Accurate and well controlled
Angular and awkward
Clumsy

Quick stride and movements
Slow stride and movements

PHYSIQUE AND BEARING
Impressive in bearing
Satisfactory impression
Unimpressive

Healthy looking, well developed, and nourished
Unhealthy, feeble physique

Forceful, efficient, energetic, upright posture and gait
Slouching gait
Weak, inefficient movements and bearing

Plump (pyknic) proportions    Florid
Well and symmetrically proportioned
Thin (asthenic)    Pale

PERSONAL APPEARANCE AND EXPRESSION
Attractive and good-looking (positive reaction)
Pleasant
Uninteresting, indifferent attractiveness
Ugly and repulsive (negative reaction)

Strong expressiveness of face and gestures
Expressionless

Quick and strong sense of humour
Slow but sure
Unable to see humour

Mature, serious, philosophical
Immature, childish

Sensual
Effeminate
Frank
Secretive

Cheerful, optimistic
Depressed, melancholy
Excitable, irritable
Even-tempered
Calm, phlegmatic

Special Characteristics ________

SPEECH
Voice resonant, pleasing, well modulated
Hard, harsh, pinched

Expresses meaning directly, grammatically, with facility
Unable to express himself, ungrammatical

Garrulous, overtalkative
Rather voluble
Seldom speaks of own accord
Reticent, taciturn

Clear, fluent, distinct
Stutters, stammers
Accent

Brilliant in talking, wide vocabulary
Dull and stolid, narrow vocabulary

PERSONAL CARE
Fastidious in dress, overmanicured
Good taste, neat and clean
Passable and inconspicuous
Careless in dress and cleanliness
Slovenly and unkempt

SELF-ASSERTION
Pompous and overbearing
Complacent
Self-confident and possessed
Self-critical and depreciatory
Embarrassed, bashful, self-conscious
Anxious, apprehensive
Submissive, retiring

Decisive
Wavering
Contrasuggestible
Suggestible

CO-OPERATIVENESS
Willing to co-operate in every respect, enters into spirit
Reserved and formal
Constrained and suspicious, outside the situation
Rude and hostile

Scrupulous, punctual and regular in attendance, application
Industrious
Somewhat indifferent
Lazy and irregular

ALERTNESS AND CONCENTRATION
Intelligently attentive, wide awake
Concentrated
Absent-minded
Easily distracted, inattentive

TEST REACTIONS: PLANNING
Analytical
Serious but unsystematic
Trial and error    Profits by past experience
Haphazard    Repeats same mistakes

EMOTIONS
Wild and unrestrained emotional behavior and remarks
Wilful and childish reactions, capricious
Some loss of self-control, and overt emotion
Humorous and unconcerned
Serious, philosophical
Repressed and inhibited

(Vernon, 1938, p. 56. By permission of the British Journal of
Educational Psychology.)

Huston, Shakow, and Erickson (1934) repeated one of Luria's
experiments with hypnotized subjects, and found that under hypnosis
the subjects tended to exhibit few motor disturbances when they
gave verbal responses related to the conflict. When normally awake,
the same subjects gave fewer verbal responses, but more motor
reactions. Consequently, the investigators suggested the hypothesis
that there are various levels of discharge of nervous impulses. If the
excitation created by a stimulus is not discharged verbally, it tends to
spread to voluntary motor responses, and if not discharged there, to
involuntary motor responses.

Olson and Jones (1931) and Sharp (1938) applied the Luria
technique to find out what kinds of stimulus words cause the greatest
responses among various types of subjects. Stimulus-word lists
containing fifteen words each were prepared by Sharp to sample
responses to given regions of activity: family, social, religion,
health, and intellect. These words were distributed among 125 neutral
words, and administered to college and high school groups, scholastic
failures, and stutterers. The following directions were used (p. 114):

Rest your arms comfortably on the table. Place the first and second fingers
of each hand on [the rubber which is stretched across the] cups in front of
you. Now, I shall read a list of words to you. As soon as you hear my words
respond at once with the first word that comes to your mind. At the same
time press your right hand. Your left hand is to remain quietly in the same
position throughout the experiment.

Each response was scored to show the reaction time and the amount
of voluntary and involuntary motor disturbance as shown by the
right and the left hands respectively. The test was repeated on a
later day and the correlations were found to be fairly high between
corresponding scores, as recorded in Illus. 236. This table indicates
a greater consistency between trials for the verbal reaction times than
for the motor disturbances, and also for the supposedly affectively
toned words than for the neutral words. All reliabilities are so high
that chance variations are fairly well ruled out.

ILLUS. 236. CORRELATIONS BETWEEN FIRST AND SECOND
TRIALS OF AN ASSOCIATION MOTOR TEST

Test          Reaction Time     Somatic Disturbances

Total           .94 ± .01           .87 ± .02
Family          .95 ± .01           .84 ± .03
Social          .96 ± .01           .87 ± .02
Religious       .93 ± .01           .88 ± .02
Health          .90 ± .02           .67 ± .05
Intellect       .92 ± .01           .85 ± .03
Neutral         .75 ± .04           .68 ± .05

(Sharp, 1938, p. 118. By permission of the University of Iowa, Studies
in Child Welfare.)

The differences among group averages are also interesting, as
shown in Illus. 237. Here it appears that average freshman girls in
the university had shorter reaction times than either the scholastic
failures or the high school girls. Words related to social situations
caused the longest reaction times and also the greatest somatic
disturbances among the average college freshmen, whereas words
related to intellectual situations produced these effects among the
scholastic failures. The high school girls showed more verbal and
nonverbal blocking from words concerned with religion and social
activities. Individual profiles indicated quite clearly the existence of
particular regions of stress which were in some cases not previously
diagnosed.

ILLUS. 237. GROUP MEANS OF REACTION TIME AND
SOMATIC DISTURBANCES

              Reaction Time (Sec.)       Somatic Disturbances
Category      Group 1    2      3        Group 1    2      3

Family          1.79   1.99   2.10         2.56   1.69   2.73
Social          2.14   2.14   2.55         6.58   4.17   4.23
Religious       1.60   2.25   2.43         2.78   3.25   2.73
Health          1.54   2.07   2.39         1.78   2.42   3.97
Intellect       1.88   2.12   3.00         3.62   2.90  13.73
Neutral         1.57   1.95   2.05         1.06   1.55   1.00

Note: Group 1. College Freshman Women
      Group 2. High School Girls
      Group 3. College Women Scholastic Failures and Stutterers

(After Sharp, 1938, p. 121. By permission of the University of Iowa,
Studies in Child Welfare.)

Lie Detectors 

The use of laboratory methods for the detection of lying has
become fairly common in police groups. Early work by Larson and
Marsden has grown today into fairly well-established practices which
often result in confessions, thus avoiding expensive court trials. The
usual procedure secures records of physiological responses during a
period of interrogation. If peculiar responses are noted, deception
is suspected. The responses most often measured are breathing, pulse,
blood pressure, and electrical skin resistance, sometimes called the
psychogalvanic reflex. Changes in these responses are recorded
mechanically after each stimulus question or situation. A good appraisal
of this technique is given by Larson and Haney (1932) and by
Marsden (1948). Inbau (1946) gave a critical discussion of lie detectors
and estimated that the best methods and technicians probably gave
accurate results in only 75 per cent of the cases. Bitterman and Marcus
(1947) made respiratory and cardiac tracings with a Keeler Polygraph
on eighty-one men in a college dormitory where one hundred
dollars had been stolen. They reported that the respiratory tracings did
not yield a differential classification. From the cardiovascular data
three general classes of persons appeared: (a) negligible to all
questions, (b) significant to all questions, or (c) different reactions to
relevant and irrelevant questions.

Reliability and Validity

The reliabilities reported in various ways are well illustrated by
Illus. 238. This shows that two similar forms of measures of
deception gave high correlations in all the tests of cheating by scoring
one's own paper. Peeping when the eyes should be shut, faking a solution
to a puzzle, and getting help from a dictionary were more variable
activities. Hartshorne and May (1930) reported that separate
measures of Service showed retest reliabilities near .90, but that the
combined measures showed odd-even correlations between six
sub-batteries of only .78. The Moral Knowledge Tests showed a high
correlation between two vocabulary forms, .94, but the Opinion tests
showed less consistency because, in part, of smaller numbers of items
and shorter times. When items and time were held constant all of these
tests had high and similar reliability. Opinion Battery A showed
considerably less consistency than Battery B, probably owing to the
length of the tests.

ILLUS. 238. RELIABILITIES OF TECHNIQUES USED FOR
MEASURING DECEPTION

Types of Conduct

1. Copying from a key or answer sheet (3 tests)
2. Copying from a key or answer sheet (duplicating technique) (7 tests)
3. Adding on more scores (6 speed tests)
4. Peeping when eyes should be shut (3 co-ordination tests)
5. Faking a solution to a puzzle (3 tests)
   (2 tests: Pegs and Fifteen Puzzle)
6. Faking a score in a physical ability test (4 tests)
7. Lying to win approval
8. Getting help from a dictionary or from some person on one test
   done at home

1*    2†

.871  .863  .825
.825  .887  .721  .750  .750
.620
.772
.836  .240

* The correlations in this column are based on intercorrelations
between similar forms, and predicted by the Spearman-Brown formula.

† The correlations in this column are based on retests.

(Hartshorne and May, 1928, p. 136. By permission of the Macmillan Co.)

These results emphasize the fact that consistent behavior is more 
evident in large samples of items than in small, because of the elim- 
ination of errors of measurement and of individual variations. Under 
the best conditions self-ratings, conduct measures, and ratings by 
teachers of other pupils are nearly as reliable as are measures of 
skills, and all have high reliabilities when enough items are com- 
bined. 
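
The dependence of reliability on the number of items is usually
expressed by the Spearman-Brown formula, r_kk = k r / (1 + (k - 1) r),
where r is the reliability of the present test and k is the factor by
which it is lengthened. The sketch below is offered only as an
illustration; the function names and the sample figures are chosen here
and do not come from the studies cited.

```python
def spearman_brown(r, k):
    """Predicted reliability when a test of reliability r is made k times as long."""
    return k * r / (1 + (k - 1) * r)


def length_factor(r, target):
    """Factor by which a test must be lengthened to reach a target reliability."""
    return target * (1 - r) / (r * (1 - target))


# A correlation of .70 between two half-tests, stepped up to full length.
full_length = spearman_brown(0.70, 2)   # about .82

# A short test with reliability .50 must be about four times as long
# to reach a reliability of .80.
needed = length_factor(0.50, 0.80)
```

The same formula shows why a short opinion test may appear less
consistent than a long one even when the items are of equal quality.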

Definite answers can be given to questions of validity only when
some true criterion of truth is available. Since there are no very
reliable and accepted criteria for most of the traits under
consideration, one can only compare various appraisals to see how well
they agree. A few comparisons are given below.

Maslow (1937) found from extensive studies of dominance that
feelings of dominance as shown by self-ratings and observed
dominant behavior were not closely related. Jarvie and Johns (1938)
compared ratings on Bernreuter traits for students intimately known by
the raters with the actual Bernreuter scores. From the low
correlations reported, they concluded that the scores had little value in
the solution of adjustment difficulties.

A comparison by three questionnaires of normal adults and mental
hospital cases who had been matched for age, schooling, occupations,
and mental ability was made by Landis et al. (1935). The results
showed that all three questionnaires, the Bernreuter Personality
Inventory, the Page Questionnaire of Schizophrenic Traits, and the
Woodworth Psychoneurotic Inventory, were not valuable for
distinguishing normal from insane persons, although the various types
of psychosis showed small differences in scores.

Burnham and Crawford (1935) marked the items on the Bernreuter
test 10 times according to a pair of dice. They found that
chance marks secured in this fashion yielded fairly high neurotic and
introverted scores.

Ratings on the Haggerty-Olson-Wickman Scale by teachers were
compared with the results of simply asking teachers to name the boys
and girls causing the most trouble in school. Of the lowest 10 per cent
of the most poorly adjusted pupils as shown by H-O-W scores, only
about half were named by the teachers.

In a boys' camp Newstetter (1937) made a check on the validity of 
a preferred-associate ballot by actual observation of free time spent 
with various boys. He found a mean correlation of .73 between the 
two measures, and also reported that the best-liked boys were not 
necessarily those who were cordial, but those who had some skill. 

A valuable comparison of three types of appraisal was made by
Hanks (1936). He secured six judges who read fifty autobiographies
of about one thousand words which had been written as part of the
requirements of a college freshman English class. Each judge wrote
a short analysis of each case showing the main types of responses,
and also estimated the scores which had been made on three tests:
(1) the Wilke Attitude Scale, (2) the Clark revision of Thurstone's
Personality Schedule, and (3) the Deutsch Conformity Tests. The
correlations between actual scores and estimates were all low, but
slightly higher for the second test than for the other two. In most
cases the SD of estimated scores was smaller than the SD of actual
scores. In a further investigation Hanks instructed six judges to rate
students after reading their autobiographies, using 28 items which
concerned family relationships, adjustments in social, religious and
economic spheres, and interests. These ratings were compared to
self-ratings on the same items. The results showed an average of only
^0 per cent correspondence, which was raised to Ih per cent when the
same judges were instructed in Adler's classification of types
according to variations in cooperation and activity. This instruction
did not lead to the use of Adler's types by all judges, but it did lead
to a more logical and thorough approach to the task.

Hartshorne and May (1930) found that classmates' judgments, 
shown by a guess-who test of reputation for desirable traits, correlated 
.'18 with teachers' judgments as represented by a weighted sum 
of grades in deportment, a conduct record, and a check list of traits. 
The teachers' check lists showed a correlation of .81 with the pupils' 
conduct records, and .19 with their deportment grades. 

An inspection of these reports does not result in definite conclusions 
about the value of the several methods. We find much evidence 
that two methods designed to measure the same traits usually show 
positive correlations. These become higher as the appraisals become 
more similar in procedure and more complete in sampling, and as 
the groups of persons become larger and more similar in environment. 
The actual correlations between self-ratings on questionnaires 
and behavior records are usually low. The correlations between ratings 
by teachers, counselors, and interviewers and behavior records 
are often low. The correlations between conduct tests and time-sampling 
methods are usually moderate. Since all of these appraisals 
have been shown to have high self-consistency, one must conclude 
that they are evaluating different aspects of the rater, the ratee, or 
both. A more analytical approach is needed to discover basic patterns 
of behavior. 


SUMMARY 

This chapter has presented a great variety of both content and 
procedures, and it should be admitted that the techniques are far 
from being well developed. The variety of possible acts is such as 
to make observation and recording an extremely complicated affair. 
This usually results in oversimplifying the observation and the scoring 
by means of a simple rating scale or check list. These are often 
found to have serious limitations which are undesirable in the long 
run, although they may give immediate results of value. An enormous 
amount of research is needed in order to determine more adequately 
what independent variables are best observed in particular situations. 


STUDY GUIDE QUESTIONS 

1. What are the chief advantages and disadvantages of measures based 
on observation of behavior when the subjects do not know they are being 
observed? 

2. Why were verbal character sketches found to yield more reliable results 
than studies based on ratings of single items? 

3. What are some of the advantages and disadvantages of sociometry? 

4. What are the advantages and disadvantages of group interviewing 
techniques? 

5. What are the main assumptions made in using a lie detector? 

6. How have laboratory controls been used in studying emotional 
behavior? 

7. Summarize Porter's categories relating to a therapeutic interview. 

AND INVENTORIES 




Code numbers used in Appendix II are shown in parentheses after some 
of the addresses in this bibliography 

Acorn Publishing Co., Inc., Rockville Centre, N.Y. (13) 
American Council on Education, 744 Jackson Place, Washington 6, D.C. 
American Foundation for the Blind, Inc., 15 West 16 St., New York 11, N.Y. 
American Hearing Society (formerly American Society for the Hard of Hearing), 817 Fourteenth St., N.W., Washington 5, D.C. 
American Psychological Association, Inc., 1515 Massachusetts Ave., N.W., Washington 5, D.C. 
California Test Bureau, 5916 Hollywood Blvd., Los Angeles 28, Calif. (1) 
Center for Psychological Service, George Washington University, Washington 6, D.C. 
College Book Co., 1836 North High St., Columbus, Ohio. 
College Entrance Examination Board, Box 592, Princeton, N.J. 
Columbia University, Teachers College, Bureau of Publications, New York 27, N.Y. (11) 
Committee on Diagnostic Reading Tests, Kingscote Apt. 39, 419 West 119 St., New York 27, N.Y. (21) 
Cooperative Test Service. Now the Cooperative Test Division, Educational Testing Service, Princeton, N.J. 
Educational Records Bureau, 21 Audubon Ave., New York, N.Y. 
Educational Test Bureau, Educational Publishers, Inc., 720 Washington Ave., S.E., Minneapolis 14, Minn. (3) 
Educational Testing Service, Princeton, N.J. (2) 
George Washington University, Washington, D.C. (14) 
Graybar Electric Co., 420 Lexington Ave., New York 17, N.Y. 
Gregory, C. A., Company (4) 

Hale, E. M. and Company, Eau Claire, Wis. (15) 
Houghton Mifflin Co., 2 Park St., Boston 7, Mass.; 432 Fourth Ave., New York 16, N.Y. (5) 
Institute for Personality and Ability Testing, 313 West Avondale St., Champaign, Ill. 
Institute of Living, 200 Retreat Ave., Hartford 2, Conn. 
Kansas State Teachers College, Emporia, Kansas (16) 
Kentucky Cooperative Testing Service, University of Kentucky, Lexington 29, Ky. 
McGraw-Hill Book Co., Inc., 333 West 42 St., New York 18, N.Y. 
McKnight and McKnight, 109-11 West Market St., Bloomington, Ill. 
The Macmillan Co., 60 Fifth Ave., New York 11, N.Y. 
Management Service Co., 3136 North 24 St., Philadelphia, Pa. 
Marietta Apparatus Co., Marietta, Ohio. 
National Office Management Association, 2118 Lincoln-Liberty Bldg., Philadelphia 7, Pa. (22) 
New York State Department of Education, Albany, N.Y. (Regents Examinations) 
Ohio Scholarship Tests, Ohio State Department of Education, Columbus 15, Ohio. 
Ohio State University, Bureau of Publications, Columbus, Ohio. (18) 
Psychological Corporation, 522 Fifth Ave., New York 18, N.Y. (7) 
Public School Publishing Co., 509-13 North East St., Bloomington, Ill. (8) 
Purdue University, Lafayette, Indiana, Bureau of Publications (19) 
Science Research Associates, Inc., 228 South Wabash Ave., Chicago 4, Ill. (9) 
Sheridan Supply Co., P.O. Box 387, Beverly Hills, Calif. (20) 
Society for Research on Child Development, National Research Council, 2101 Constitution Ave., Washington 25, D.C. 
Stanford University Press, Stanford, Calif. (10) 
State High School Testing Service for Indiana, Division of Educational Reference, Purdue University, Lafayette, Ind. 
Steck Company, Austin, Texas 
Stoelting, C. H., Company, 424 North Homan Ave., Chicago 20, Ill. 
University of California Press, Berkeley, Calif. 
University of Iowa, Iowa City, Iowa, Bureau of Educational Research. (6) 
University of Minnesota Press, Minneapolis 14, Minn. (17) 
Western Reserve University Press, Cleveland, Ohio 
World Book Co., 313 Park Hill Ave., Yonkers 5, N.Y. (12) 

Publishers arranged by code numbers: 

(1) California Test Bureau 

(2) Educational Testing Service 

(3) Educational Test Bureau 

(4) C. A. Gregory Company 

(5) Houghton Mifflin Company 

(6) University of Iowa 

(7) Psychological Corporation 

(8) Public School Publishing Company 

(9) Science Research Associates 

(10) Stanford University Press 

(11) Bureau of Publications, Teachers College, Columbia University 

(12) World Book Company 

(13) Acorn Publishing Company 

(14) George Washington University 

(15) E. M. Hale and Company 

(16) Kansas State Teachers College 

(17) University of Minnesota 

(18) Ohio State University 

(19) Purdue University 

(20) Sheridan Supply Company 

(21) Committee on Diagnostic Reading Tests 

(22) National Office Management Association 

Each test title is followed by: 

a. A Roman numeral which indicates the general type of test. 
b. An Arabic number which shows the subject matter or method. 
c. A number or numbers in parentheses which show the grade or age 
levels covered by the test. 
d. Numbers which represent the publishers listed in Appendix I. 
e. Numbers in italics (where given) which indicate pages in this text. 

The following lists give the classification structure in detail. 

A. Main Types 

I. Achievement 
II. Aptitude 
III. Intelligence 
IV. Interest 
V. Adjustment and Attitude 

B. Subject Matter for Achievement and Aptitude Tests 

1. Applied Science 
2. Arithmetic 
3. Art 
4. Batteries of Tests 
5. Business 
6. English 
7. French 
8. German 
9. Handwriting 
10. Health 
11. Home Economics 
12. Italian 
13. Language 
14. Latin 
15. Library Skills 
16. Mathematics 
17. Music and Sound 
18. Natural Science 
19. Reading 
20. Social Studies 
21. Spanish 
22. Spelling 
23. Teaching Skills 
24. Vocational Skills 
25. Miscellaneous 

Types of Tests for Intelligence 

1. Group 
2. Individual 

Types of Tests for Adjustment and Attitude 

1. Inventories 
2. Projective Techniques 

C. Grade or Age 

1. Below Grade 1; 1-5 years 
2. Grades 1, 2, 3; 6-9 years 
3. Grades 4, 5, 6; 10-12 years 
4. Grades 7, 8, 9; 13-15 years 
5. Grades 10, 11, 12; 16-18 years 
6. College or more; 19 or more years 

D. For index of publishers see Appendix I. In the list below only the 
largest publishers are represented by numbers; others are identified 
by name. 

* Adapted from Test Service Notebook No. 6, "Organization of a Test Library 
in a School of Education," by W. N. Durost and M. E. Allen, by permission of Dr. 
Durost and the World Book Co. 
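
The classification structure above lends itself to mechanical decoding. The following is a minimal sketch in Python, not part of the original appendix: the function name `decode`, the class `TestCode`, and the regular expression are illustrative assumptions, while the lookup tables are copied from the lists in Appendix I and Appendix II. It assumes a code written as a Roman numeral, a hyphen, an Arabic number, grade levels in parentheses, and an optional publisher number; italic page references are not handled.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Lookup tables transcribed from the appendix lists.
MAIN_TYPES = {"I": "Achievement", "II": "Aptitude", "III": "Intelligence",
              "IV": "Interest", "V": "Adjustment and Attitude"}
SUBJECTS = dict(enumerate([
    "Applied Science", "Arithmetic", "Art", "Batteries of Tests", "Business",
    "English", "French", "German", "Handwriting", "Health", "Home Economics",
    "Italian", "Language", "Latin", "Library Skills", "Mathematics",
    "Music and Sound", "Natural Science", "Reading", "Social Studies",
    "Spanish", "Spelling", "Teaching Skills", "Vocational Skills",
    "Miscellaneous"], start=1))
INTELLIGENCE_TYPES = {1: "Group", 2: "Individual"}
ADJUSTMENT_TYPES = {1: "Inventories", 2: "Projective Techniques"}
GRADE_LEVELS = {
    1: "Below Grade 1; 1-5 years", 2: "Grades 1, 2, 3; 6-9 years",
    3: "Grades 4, 5, 6; 10-12 years", 4: "Grades 7, 8, 9; 13-15 years",
    5: "Grades 10, 11, 12; 16-18 years",
    6: "College or more; 19 or more years"}
PUBLISHERS = dict(enumerate([
    "California Test Bureau", "Educational Testing Service",
    "Educational Test Bureau", "C. A. Gregory Company",
    "Houghton Mifflin Company", "University of Iowa",
    "Psychological Corporation", "Public School Publishing Company",
    "Science Research Associates", "Stanford University Press",
    "Bureau of Publications, Teachers College, Columbia University",
    "World Book Company", "Acorn Publishing Company",
    "George Washington University", "E. M. Hale and Company",
    "Kansas State Teachers College", "University of Minnesota",
    "Ohio State University", "Purdue University", "Sheridan Supply Company",
    "Committee on Diagnostic Reading Tests",
    "National Office Management Association"], start=1))

# Assumed layout: ROMAN-NUMBER (levels) [publisher]; spacing is flexible.
CODE_RE = re.compile(
    r"^\s*(I|II|III|IV|V)\s*-\s*(\d+)\s*\(([\d,\s]+)\)\s*(\d+)?\s*$")


@dataclass
class TestCode:
    main_type: str
    category: str
    levels: List[str]
    publisher: Optional[str]  # None when the publisher is given by name


def decode(code: str) -> TestCode:
    """Translate a classification code into its plain-language parts."""
    match = CODE_RE.match(code)
    if match is None:
        raise ValueError(f"unrecognized classification code: {code!r}")
    roman, number, levels, publisher = match.groups()
    # The Arabic number means a subject for types I and II, and a test
    # method for types III and V; no sub-list is given for type IV.
    if roman in ("I", "II"):
        table = SUBJECTS
    elif roman == "III":
        table = INTELLIGENCE_TYPES
    elif roman == "V":
        table = ADJUSTMENT_TYPES
    else:
        table = {}
    category = table.get(int(number), f"unlisted ({number})")
    level_names = [GRADE_LEVELS.get(int(n), f"unlisted ({n})")
                   for n in re.findall(r"\d+", levels)]
    publisher_name = None
    if publisher is not None:
        publisher_name = PUBLISHERS.get(int(publisher),
                                        f"unlisted ({publisher})")
    return TestCode(MAIN_TYPES[roman], category, level_names, publisher_name)


if __name__ == "__main__":
    print(decode("II-5 (4, 5, 6) 12"))
```

Entries whose publisher is identified by name rather than by number simply decode with `publisher` left as `None`; a garbled code raises `ValueError` instead of being guessed at.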

For example, a test title followed by II-5 (4, 5, 6) 12 is an aptitude test in 
the field of business for persons more than twelve years old, published by 
the World Book Co. The title gives additional information on test content, 
e.g., Turse Shorthand Aptitude Test. The list of tests follows: 

I. ACHIEVEMENT 
Applied Science 

Cooperative Pre-Flight Aeronautics Tests 
  1. Aerodynamics and Aircraft Structures I-1(5,6)2 
  2. Aircraft Engines I-1(5,6)2 
  3. Meteorology (see Natural Science) 
  4. Navigation I-1(5,6)2 
  5. Radio and Communications I-1(5,6)2 
Indiana Farm Shop Tools: Recognition and Use I-1(4,5,6)19 
Indiana Mechanical Drawing Test I-1(5)19 
Purdue Test for Electricians, Grades 9-16 and adults I-1(4,5,6)9 
Purdue Test for Machinists and Machine Operators I-1(4,5,6)9 
USAFI in Advanced Electronics I-1(6)2 
USAFI in Diesel Engineering CDEG I-1(6)2 
USAFI in Electricity and Magnetism I-1(6)2 
USAFI in Electron Tubes and Circuits I-1(6)2 
USAFI in Engineering Drawing I-1(6)2 
USAFI in Engineering Electronics I-1(6)2 
USAFI in Engineering Mechanics I-1(6)2 
USAFI in Fluid Mechanics I-1(6)2 
USAFI in Machine Design I-1(6)2 
USAFI in Mechanical Drawing I-1(5)2 
USAFI in Radio Engineering, I I-1(6)2 
USAFI in Radio Engineering, II I-1(6)2 
USAFI in Strength of Materials I-1(6)2 
USAFI in Surveying I-1(6)2 

Arithmetic 

Analytical Scales of Attainment, Arithmetic A1 I-2(3)3 
Analytical Scales of Attainment, Arithmetic A2 I-2(3)3 
Analytical Scales of Attainment, Arithmetic A3 I-2(4)3 
Arithmetical Reasoning Test, Form B I-2(5,6)9 
Basic Skills in Arithmetic Test, Form B I-2(4,5)9 
Brueckner Diagnostic Test in Decimals I-2(3,4)3 
Brueckner Diagnostic Test in Fractions I-2(3,4)3 
Clapp-Young Arithmetic Test, Form A I-2(3,4)5 
Clapp-Young Arithmetic Test, Form B I-2(3,4)5 
Commercial Arithmetic Test (Indiana) I-2(4,5)19 
Cooperative Commercial Arithmetic Test I-2(5)2 
Iowa Every-Pupil Tests of Basic Skills 
  Arithmetic Skills, Advanced, Test D I-2(3,4)5 
  Arithmetic Skills, Elementary, Test D I-2(2,3)6 
Lee-Clark Arithmetic Fundamentals Survey Test I-2(4,5)1 
Los Angeles Diagnostic Tests: Fundamentals I-2(2,3,4)1 
Los Angeles Diagnostic Tests: Reasoning I-2(2,3,4)1 
Los Angeles Diagnostic Tests: Signs, Symbols, and Vocabulary of Arithmetic I-2(2,3,4)1 
Metropolitan Achievement Tests 
  Advanced Arithmetic I-2(4)12 
  Elementary Arithmetic I-2(2,3)12 
  Intermediate Arithmetic I-2(3)12 
Number Fact Check Sheet I-2(3,4)1 
Otis Arithmetic Reasoning Test, A I-2(3,4)5-12 
Otis Arithmetic Reasoning Test, B I-2(3,4)5-12 
Progressive Arithmetic Tests, Elementary, A I-2(3)1 187 
Progressive Arithmetic Tests, Elementary, Am I-2(3)1 
Progressive Arithmetic Tests, Intermediate I-2(4)1 
Progressive Arithmetic Tests, Intermediate, Am I-2(4)1 
Progressive Arithmetic Tests, Primary A I-2(2)1 
Public School Achievement Tests: Arithmetical Computation I-2(2,3,4)8 
Public School Achievement Tests: Arithmetical Reasoning I-2(2,3,4)8 
Schorling-Clark-Potter Hundred-Problem Test I-2(4,5)12 
Stanford Achievement Tests 
  Advanced Arithmetic I-2(4)12 
  Advanced Arithmetic, machine scoring I-2(4)12 
  Intermediate Arithmetic I-2(3)12 
  Intermediate Arithmetic, machine scoring I-2(3)12 
  Primary Arithmetic I-2(2)12 
USAFI Advanced Arithmetic I-2(5)2 
USAFI Business Arithmetic I-2(5)2 
Wisconsin Inventory Tests in Arithmetic I-2(2,3,4)8 
Woody-McCall Mixed Fundamentals I-2(2,3,4)11 

Art 

Fundamental Abilities of Visual Art, Lewerenz I-3(3,4,5,6)1 323 
Graves Design Judgment Test I-3(4,5,6)7 309 
Horn Art Aptitude Inventory Preliminary I-3(4,5,6) 321 
Kline-Cary Measuring Scale for Freehand Drawing I-3(3,4,5) 319 
Knauber Art Ability Test I-3(4,5,6)8 323 
McAdory Art Test I-3(2,3,4,5,6)11 33, 303, 307 
Meier Art Tests: Art Judgment I-3(4,5,6)6 303, 307, 323 
Tests in Fundamental Abilities of Visual Art I-3(2,3,4,5)1 
Varnum Selective Art Aptitude Test I-3(4,5,6) 

Batteries of Tests 

American School Achievement Tests, Advanced I-4(2,3)8 
American School Achievement Tests, Intermediate I-4(3)8 
American School Achievement Tests, Primary I I-4(2)8 
American School Achievement Tests, Primary II I-4(2)8 
Comprehensive Curriculum Test, Form 1 I-4(4,5)11 
Comprehensive Curriculum Test, Form 2 I-4(4,5)11 
Cooperative Contemporary Affairs for College Students I-4(6)2 
Cooperative General Achievement Test I-4(5,6)2 
Cooperative General Culture Test I-4(6)2 
Cooperative Test on Recent Social and Science Development I-4(5)2 
Coordinated Scales of Attainment, Battery 3 I-4(2)3 
Coordinated Scales of Attainment, Battery 8 I-4(4)3 
Every-Pupil Primary Achievement Test I-4(2)16 
Graduate Record Examinations, College Seniors and Graduates I-4(6)2 205-209 
Iowa Educational Development, Grades 8-13 I-4(4,5)9 157, 158 
Iowa High School Content Examination I-4(5,6)6 194, 195 
Jastak Wide Range Achievement Test I-4(2,3,4,5)7 
Metropolitan Achievement Tests 158 
  Advanced Battery, Complete I-4(4)12 
  Advanced Battery, Partial I-4(4)12 
  Elementary Battery I-4(2,3)12 
  Intermediate Battery, Complete I-4(3,4)12 
  Intermediate Battery, Partial I-4(3)12 
  Primary I Battery I-4(2)12 
  Primary II Battery I-4(2)12 
Modern School Achievement Tests: Skills Edition, I I-4(2,3,4)11 
Modern School Achievement Tests: Skills Edition, II I-4(2,3,4)11 
Myers-Ruch High School Progress Test I-4(5)12 
Ohio General Scholarship Test for High School Seniors I-4(6)18 
Progressive Achievement Tests 170, 177 
  Advanced Battery I-4(4,5,6)1 
  Elementary Battery I-4(3)1 
  Intermediate Battery I-4(4)1 
  Primary Battery I-4(2)1 
Progressive Tests in Social and Related Sciences 170 
  Social Studies, Part 1, Elementary Battery, A I-4(3,4)1 
  Social Studies, Part 1, Elementary Battery, B I-4(3,4)1 
  Social Studies, Part 2, Elementary Battery, A I-4(3,4)1 
Public School Achievement Tests: Battery A I-4(2,3,4)8 
Public School Achievement Tests: Battery B I-4(3,4)8 
Public School Achievement Tests: Battery C I-4(3,4)8 
Public School Attainment Tests for High School Entrance I-4(4,5)8 
Public School Correlated Attainment Scales I-4(3,4)8 
Sones-Harry High School Achievement Test, A I-4(5,6)12 
Standard Graduation Examination I-4(4)12 
Stanford Achievement Tests 377 
  Advanced Battery, Complete I-4(4)12 
  Advanced Battery, Partial I-4(4)12 
  Intermediate Battery, Complete I-4(3)12 
  Intermediate Battery, Partial I-4(3)12 
  Primary Battery I-4(2)12 
Unit Scales of Attainment 
  Form A, Division 1 I-4(3)3 
  Form A, Division 3 I-4(4)3 
  Form B, Division 1 I-4(3)3 

  Form C, Division 1 I-4(3)3 
  Form C, Division 2 I-4(3)3 
  Form C, Division 3 I-4(4)3 
  Minimum Essential Battery, Form A, Division 1 I-4(3)3 
  Minimum Essential Battery, Form A, Division 2 I-4(3)3 
  Minimum Essential Battery, Form A, Division 3 I-4(4)3 
  Minimum Essential Battery, Form B, Division 2 I-4(3)3 
  Minimum Essential Battery, Form B, Division 3 I-4(4)3 
  Minimum Essential Battery, Form C, Division 1 I-4(3)3 
  Minimum Essential Battery, Form C, Division 2 I-4(3)3 
  Minimum Essential Battery, Form C, Division 3 I-4(4)3 
  Primary Division, Form A, Grade 1, first half I-4(2)3 
  Primary Division, Form A, Grade 1, last half I-4(1)3 
  Primary Division, Form A, Grade 2, first half I-4(2)3 
  Primary Division, Form A, Grade 3 I-4(2)3 
  Primary Division, Form B, Grade 1, first half I-4(1)3 
  Primary Division, Form B, Grade 2, last half I-4(2)3 
  Primary Division, Form B, Grade 3 I-4(2)3 
  Primary Division, Form C, Grade 1, first half I-4(1)3 
  Primary Division, Form C, Grade 1, last half I-4(1)3 
  Primary Division, Form C, Grade 2, last half I-4(2)3 
  Primary Division, Form C, Grade 3 I-4(2)3 

Business 


Blackstone Stenographic Proficiency Tests 
  Stenography I-5(6)12 
  Typewriting I-5(6)12 
Bookkeeping Test (Indiana High School Tests) I-5(5)19 
Breidenbaugh Bookkeeping Tests, Test 1 I-5(5)8 
Breidenbaugh Bookkeeping Tests, Test 2 I-5(5)8 
Breidenbaugh Bookkeeping Tests, Test 3 I-5(5)8 
Breidenbaugh Bookkeeping Tests, Test 4 I-5(5)8 
Clerical Perception Test I-5(4,5)3 
Elwell-Fowlkes Bookkeeping Test I-5(5)12 
General Test of Business Information I-5(4,5,6)16 
Hiett Stenography Test (Gregg), Test I I-5(5)16 
Hiett Stenography Test (Gregg), Test II I-5(5)16 
Kauzer Typewriting Tests, Test I I-5(5)16 
Kauzer Typewriting Tests, Test II I-5(5)12 
Kimberly-Clark Typing Ability Analysis I-5(5)9 
NOMA Bookkeeping Test I-5(5)22 
NOMA Business Fundamentals and General Information I-5(4,5)22 
NOMA Filing Test I-5(4,5)22 
NOMA Machine Calculation Test I-5(5)22 
Parke Commercial Law Test I-5(5,6)16 
Parke Commercial Law Test I-5(5)6 
Shemwell-Whitcraft Bookkeeping Test, I I-5(5)16 
Shemwell-Whitcraft Bookkeeping Test I-5(5)16 
Shorthand Test (Indiana High School Tests) I-5(5)19 
SRA Dictation Skills: Accuracy, Speed I-5(5,6)9 
SRA Language Skills I-5(5,6)9 
SRA Typing Skills I-5(5,6)9 
Thompson Business Practice Test I-5(4,5)12 
Thurstone Employment Tests: Clerical, Typing I-5(6)12 201, 204 
Turse-Durost Shorthand Achievement Tests (Gregg) I-5(4,5)12 
Typewriting Test (Indiana High School Tests) I-5(5)19 
USAFI Bookkeeping and Accounting, 1 I-5(5)7 
USAFI Bookkeeping and Accounting, 2 I-5(5)7 
USAFI Commercial Correspondence I-5(6)7 
USAFI Gregg Shorthand I-5(5)7 
USAFI Typewriting I-5(5)7 

English 

Barrett-Ryan Literature Test I-6(5,6)16 
Barrett-Ryan-Schrammel English Test I-6(5,6)12 
Carroll Prose Appreciation Test, College I-6(6)3 
Carroll Prose Appreciation Test, Junior High I-6(4)3 
Carroll Prose Appreciation Test, Senior High I-6(5)3 
Clapp-Young English Test, Form A I-6(3,4,5)5 
Columbia Research Bureau English Test I-6(5,6)12 
Columbia Vocabulary Test I-6(2,3,4,5)7 
Cooperative English Test, Form O I-6(5,6)2 
Cooperative English Test, Form OM I-6(5,6)2 
Cooperative English Test, Higher Level, Single Book I-6(5)2 
Cooperative English Test, Lower Level, Single Book I-6(5)2 
Cooperative Literary Acquaintance Test I-6(5,6)2 498, 499 
Cooperative Literary Comprehension and Appreciation I-6(5,6)2 499 
Cooperative Vocabulary Test I-6(4,5,6)2 
Cooperative Vocabulary Test, Short Form I-6(4,5,6)2 
Cross English Test I-6(5,6)12 
Davis-Roahen-Schrammel American Literature Test I-6(4,5,6)16 
Davis-Schrammel Elementary English Test I-6(3,4)16 
Essentials of English Tests, Form A I-6(4,5,6)3 
Essentials of English Tests, Form B I-6(4,5,6)5 
Essentials of English Tests, Form C I-6(4,5,6)3 
Hudelsohn English Composition Scale I-6(3,4,5)12 
Iowa Grammar Information Test, Form A I-6(4,5)6 
Iowa Grammar Information Test, Form B I-6(4,5)6 
Kennon Test of Literary Vocabulary, Form I I-6(6)11 
Kennon Test of Literary Vocabulary, Form II I-6(6)11 
Kirby Grammar Test, Form I I-6(4)16 

Kirby Grammar Test, Form II I-6(4)6 
Leonard Diagnostic Test in Punctuation and Capitals I-6(3,4,5)12 
Mechanics of Written English (Indiana High School Tests) I-6(4,5)19 
Nelson's High School English Test, Form A I-6(4,5)5 
Nelson's High School English Test, Form B I-6(4,5)5 
P-L-S Journalism Test I-6(5,6)16 
Pressey English Tests, Grades 5 to 8 I-6(3,4)8 
Public School Achievement Tests: Grammar I-6(3,4)8 
Purdue Placement Test in English, Form A I-6(5,6)5 
Purdue Placement Test in English, Form B I-6(5,6)5 
Purdue Placement Test in English, Form C I-6(5,6)5 
Reading Scales in Literature, Form A I-6(4,5)3 
Reading Scales in Literature, Form C I-6(4,5)3 
Rigg Poetry Judgment Test I-6(4,5,6)6 502 
Rinsland-Beck Natural Test of English Usage, I I-6(4,5,6)8 
Rinsland-Beck Natural Test of English Usage, II I-6(4,5,6)8 
Rinsland-Beck Natural Test of English Usage, III I-6(4,5,6)8 
Shepherd English Test I-6(4,5,6)6 
Stanford Achievement Test I-6(3,4)12 
Stanford Tests in Comprehension of Literature I-6(4,5)10 
Tools of Written English (Indiana High School Tests) I-6(4)19 
Ullman-Clark Test on Classical References I-6(5,6)6 
USAFI Business English I-6(5)2 
USAFI English, Form CEn-2 I-6(6)2 
USAFI English, Form CEn-S I-6(6)2 
USAFI English 
  Book 1: Reading and Interpretation of Literature I-6(5)2 
  Book 2: Composition I-6(5)2 
USAFI General Educational Development 
  Test 1: Correctness and Effectiveness of Expression I-6(5)2 
  Test 1 (same as above but for college level) I-6(6)2 
  Test 4: Interpretation of Literary Materials I-6(5)2 
  Test 4 (same as above but for college level) I-6(6)2 
Van Wagenen Analytical Scales of Attainment I-6(6)8 163 
Van Wagenen English Composition Scales I-6(4,5)12 163 
Wide Range Vocabulary Test I-6(2,3,4,5,6)7 

French 

American Council Alpha French Test I-7(5,6)12 200 
American Council Beta French Test I-7(5,6)12 
American Council French Grammar Test I-7(5,6)12 
Columbia Research Bureau French Test I-7(5,6)12 
Cooperative French Test, Advanced Form P I-7(5,6)2 
Cooperative French Test, Elementary Form Q I-7(5,6)2 
Cooperative French Test, Higher Level I-7(5,6)2 
French Reading Test I-7(6)2 
Miller-Davis French Tests I-7(5,6)16 
Silent Reading Test in French I-7(5,6)1 
USAFI French Grammar, Lower Level I-7(5)2 
USAFI French Grammar, Upper Level I-7(6)2 
USAFI French Reading Comprehension, Lower I-7(5)2 
USAFI French Reading Comprehension, Upper I-7(6)2 
USAFI French Vocabulary I-7(5)2 
USAFI French Vocabulary I-7(6)2 

German 

American Council Alpha German Test I-8(5,6)12 
Columbia Research Bureau German Test I-8(5,6)12 
Cooperative German Test, Advanced I-8(5,6)2 
Cooperative German Test, Elementary I-8(5,6)2 
German Reading Test I-8(6)2 
USAFI German Grammar I-8(5)2 
USAFI German Reading Comprehension I-8(5)2 
USAFI German Vocabulary I-8(5)2 

Handwriting 

Conard Manuscript Writing Standards: Pen Forms I-9(2,3,4,5,6)11 
Conard Manuscript Writing Standards: Pencil Forms I-9(2,3)11 
Courtis Standard Practice Tests in Handwriting I-9(2,3,4)12 
Thorndike Handwriting Scale I-9(2,3,4)11 161 

Health 

Brewer-Schrammel Test of Health Knowledge I-10(3,4)12 
Gates-Strang Health Knowledge Test, Form C I-10(2,3,4)11 
Gates-Strang Health Knowledge Test, Form E I-10(4,5)11 
Health and Safety Education Test (Indiana Tests) I-10(4,5)19 
Health Inventory for High School Students I-10(4,5)1 
Health Practice Inventory I-10(4,5,6)10 
MacDonald Physical Examination Record I-10(2,3,4,5,6)12 
Public School Achievement Tests: Health I-10(3,4)8 
Trusler-Arnett Health Knowledge Test I-10(4,5,6)16 

Home Economics 

Assisting with Care and Play of Children I-11(4)19 
Assisting with Clothing Problems I-11(4)19 
Child Development I-11(4,5)19 
Clothing I and Clothing II I-11(4,5)10 
Cooperative Test in Foods and Nutrition I-11(6)2 
Cooperative Test in Textiles and Clothing I-11(6)2 
Engle-Stenquist Home Economics Test 
  Clothing and Textiles, Forms A and B I-11(3,4,5)12 
  Foods and Cookery, Forms A and B I-11(3,4,5)12 
  Household Management, Forms A and B I-11(4,5)12 
Foods I: Food Selection and Preparation I-11(4,5)19 
Foods II: Planning for Family Food Needs I-11(4,5)19 
Helping with Food in the Home I-11(4)19 
Helping with the Housekeeping I-11(4)19 
Home Care of the Sick I-11(4,5)19 
Housing the Family I-11(4,5)19 
Information Tests on Foods: Illinois Food Test I-11(5)8 
Unit Scales of Attainment in Foods and Household Management, Forms A and B I-11(4)3 

Italian 

Cooperative Italian Test I-12(5,6)2 
USAFI Italian Grammar I-12(5,6)2 
USAFI Italian Reading Comprehension I-12(5,6)2 
USAFI Italian Vocabulary I-12(5,6)2 

Language 

Iowa Every-Pupil Tests of Basic Skills 
  Basic Language Skills, Advanced, Test C I-13(3,4)5 
  Basic Language Skills, Elementary, Test C I-13(2,3)5 
Iowa Language Abilities Test, Elementary I-13(3,4)12 
Iowa Language Abilities Test, Intermediate I-13(4)12 
Iowa Primary Language Test I-13(2)6 
Language Essentials Test, Forms A and B I-13(3,4)3 
Los Angeles Diagnostic Tests: Language I-13(2,3,4)1 
Progressive Language Tests 
  Advanced, Form A I-13(3,4,5)1 
  Advanced, Form A, machine scoring I-13(5,6)1 
  Elementary, Form A I-13(3)1 
  Elementary, Form A, machine scoring I-13(3)1 
  Intermediate, Form A I-13(4)1 
  Intermediate, Form A, machine scoring I-13(4)1 
  Primary, Form A I-13(2)1 
Public School Achievement Tests: Language I-13(2,3,4)8 
Stanford Achievement Tests: Advanced Language Arts I-13(4)12 
Stanford Achievement Tests: Intermediate Language Arts I-13(3)12 
Wilson Language Error Test I-13(3,4,5)12 

Latin 

Cicero Test I-14(5)16 
Cooperative Latin Test, Advanced I-14(5,6)2 
Cooperative Latin Test, Elementary I-14(5,6)2 
Cooperative Latin Test, Higher Level I-14(5,6)2 
Cooperative Latin Test, Lower Level I-14(5,6)2 
Godsey Latin Composition Test I-14(4,5)12 
Holtz Vergil Test I-14(5)16 
Hutchinson Latin Grammar Scale I-14(5)8 
Kansas First-Year and Second-Year Latin Tests I-14(5)16 
Power's Diagnostic Latin Test I-14(4,5)8 
Ullman-Kirby Latin Comprehension Test, I I-14(5,6)11 
Ullman-Kirby Latin Comprehension Test, II I-14(5,6)11 
White Latin Test I-14(5,6)12 

Library Skills 

Library Test for Junior High Schools I-15(4)1 
Peabody Library Information Test, College I-15(6)3 
Peabody Library Information Test, Elementary I-15(3,4)3 
Peabody Library Information Test, High School I-15(5)3 

Mathematics (General) 

Cooperative College Mathematics Test I-16(6)2 
Cooperative General Mathematics Test I-16(5)2 
Cooperative Mathematics Pre-Test for College Students I-16(6)2 
Cooperative Mathematics Test I-16(4)2 
Cooperative Test in Secondary School Mathematics I-16(5)2 
Foust-Schorling Test of Functional Thinking I-16(5,6)12 
Junior High School Mathematics Test, Acorn Achievement Tests I-16(4,5)13 
Progressive Mathematics Test, Advanced I-16(5,6)1 177, 178 
Progressive Mathematics Test, Advanced, machine scoring I-16( 
Purdue Industrial Mathematics Test I-16(5)19 
Rasmussen General Mathematics Test I-16(5,6)16 
Rogers Achievement Test in Mathematics I-16( )7 
USAFI Tests of General Educational Development 
  Test 5: General Mathematical Ability I-16(5)2 

Mathematics (Algebra) 

Breslich Algebra Survey Test: First Semester I-16(5)8 
Breslich Algebra Survey Test: Second Semester I-16(5)8 
Columbia Research Bureau Algebra Test, 1 I-16(5,6)12 191 
Columbia Research Bureau Algebra Test, 2 I-16(5)12 
Colvin-Schrammel Algebra Test, Tests I and II I-16(5)16 
Cooperative Algebra Test: Elementary Algebra through Quadratics I-16(4,5)2 
Cooperative Intermediate Algebra Test I-16(5)2 
Garman-Schrammel Third Semester Algebra Test I-16(5)16 
Survey Test in Elementary Algebra, Douglas I-16(5)3 
USAFI Elementary Algebra I-16(5)2 
USAFI Second-Year Algebra I-16(5)2 

Mathematics (Calculus) 

USAFI Calculus II: Integral Calculus I-16(6)2 
USAFI Differential Calculus I-16(6)2 

Mathematics (Geometry) 

American Council Solid Geometry Test, Form A I-16(5,6)12 193 
American Council Solid Geometry Test, Form B I-16(5,6)12 
Becker-Schrammel Plane Geometry Test I-16(5)16 
Columbia Research Bureau Plane Geometry Test I-16(5,6)12 
Cooperative Plane Geometry Test I-16(5)2 
Cooperative Solid Geometry Test I-16(5)2 193 
Lane-Greene Unit Tests in Plane Geometry I-16(5)6 
Orleans Plane Geometry Achievement Test, 1 I-16(5,6)12 
Orleans Plane Geometry Achievement Test, 2 I-16(5,6)12 
Survey Test in Plane Geometry I-16(5)3 
USAFI Analytic Geometry I-16(6)2 
USAFI Plane Geometry I-16(5)2 

Mathematics (Trigonometry) 

American Council Trigonometry Test I-16(5,6)12 193 
Cooperative Trigonometry Test I-16(5)2 
USAFI Plane Trigonometry I-16(5,6)2 

Music and Sound 

Beach Music Test I-17(3,4,5,6)16 
Drake Musical Memory Test: Test of Musical Talent, A I-17(2,3,4)8 
Drake Musical Memory Test: Test of Musical Talent, B I-17(2,3,4)8 
Knuth Achievement Tests in Music, Form A, 1 I-17(2,3)3 
Knuth Achievement Tests in Music, Form A, 2 I-17(3)3 
Knuth Achievement Tests in Music, Form 3 I-17(4,5)3 
Knuth Achievement Tests in Music, Form B, 1 I-17(2,3)3 
Kwalwasser Test of Music Information and Appreciation I-17(4,5,6)6 302 
Kwalwasser-Dykema Music Tests I-17(4,5,6) 301 
Kwalwasser-Ruch Test of Musical Accomplishment I-17(3,4,5)6 30^ 
Musical Achievement Test I-17(3,4)11 
Providence Inventory Test in Music I-17(3,4)12 
Seashore Measures of Musical Talents, Revised Edition I-17(3,4,5,6) 301, 305, 307 
Strouse Music Test I-17(3,4,5,6)16 
Western Electric Audiometer, Graybar Electric Co. I-17(2,3,4,5,6) 299 

Natural Science (General) 

Analytical Scales of Attainment, Division 3 I-18(4)3 
Analytical Scales of Attainment, Elementary Science I-18(4)3 
Calvert Science Information Test, Elementary I-18(3)1 
Calvert Science Information Test, Intermediate I-18(4)1 
Cooperative General Science Test, High School I-18(5)2 
Cooperative Science Test I-18(4)2 
McDougal General Science Test, I I-18(4,5)16 
McDougal General Science Test, II I-18(4,5)16 
Public School Achievement Tests: Nature Study I-18(3,4)8 
Reading Scales in Science, Form A I-18(4,5)3 
Reading Scales in Science, Form B I-18(4,5)3 
Ruch-Popenoe General Science Test I-18(4)12 
Stanford Achievement Test: Elementary Science I-18(3,4)12 
USAFI General Educational Development 
  Test 3: Interpretation of Reading Material in Natural Science I-18(6)2 
  Test 3 (same as above but for high school) I-18(5)2 
USAFI General Science, High School I-18(5)2 
USAFI Senior Science, High School I-18(5)2 

Natural Science (Astronomy) 

USAFI Examination in Astronomy I-18(6)2 


Natural Science (Biology) 

Cooperative Biology Test, High School 1-18(5)2
Cooperative College Biology Test 1-18(6)2
Presson Biology Test Test: Animal Biology 1-18(5)12

Presson Biology Test Test: Plant Biology 1-18(5)12
Ruch-Cossmann Biology Test 1-18(4,5,6)12
USAFI Biology 1-18(6)2
USAFI Biology, High School 1-18(5)2
Williams Biology Test 1-18(5)16

Natural Science (Chemistry)

ACS Cooperative Chemistry Test 1-18(6)2

ACS Cooperative Chemistry Test in Quantitative Analysis 1-18(6)2

ACS Cooperative Organic Chemistry Test 1-18(6)2

ACS Cooperative Physical Chemistry Test 1-18(6)2

Columbia Research Bureau Chemistry Test 1-18(5,6)12

Cooperative Biochemistry Test 1-18(6)2

Cooperative Chemistry Test 1-18(4,5)2 27, 28, 191

Cooperative Chemistry Test in Quantitative Analysis 1-18(6)2

Differentiated Study Guide in Chemistry 1-18(5)3

Glenn-Welton Chemistry Achievement Test 1 1-18(5)12

Kirkpatrick Chemistry Tests, I, II 1-18(5)16

Powers General Chemistry Test 1-18(5)12

USAFI Chemistry, High School 1-18(5)2

USAFI General Chemistry 1-18(6)2

Natural Science (Geology)

Cooperative Geology Test: Historical Geology 1-18(6)2
Cooperative Geology Test: Physical Geology 1-18(6)2

Natural Science (Meteorology) 

Cooperative Pre-Flight Aeronautics Tests 
Test S Meteorology 1-18(5,6)2 
USAFI Meteorology 1-18(5)2 

Natural Science (Physics)

Columbia Research Bureau Physics Test 1-18(5,6)12 
Cooperative Physics Test for College Students 1-18(6)2 194

Cooperative Physics Test, High School 1-18(5)2 
Fulton-Schrammel Physics Test, I 1-18(5)16
Fulton-Schrammel Physics Test, II 1-18(5)16
Iowa Achievement Examination in College Physics 1-18(6)6
USAFI Physics, College, Sections I, II 1-18(6)2
USAFI Physics, College, Section III 1-18(6)2
USAFI Physics, High School 1-18(5)2 

Reading 

Chapman-Cook Speed of Reading Test, Forms A and B 1-19(3,4)3 18-f 

Chicago Reading Tests, Test A 1-19(2)15 

Chicago Reading Tests, Test B 1-19(2,3)15 

Chicago Reading Tests, Test C 1-19(3)15 

Chicago Reading Tests, Test D 1-19(3,4)15 

Cooperative Vocabulary Test 1-19(4,5,6)2 

Detroit Reading Test, Test I 1-19(2)12 

Detroit Word Recognition Test 1-19(2)12

Devault Primary Reading Test 1-19(2)1 

Diagnostic Examination of Silent Reading Ability, Intermediate 

Diagnostic Examination of Silent Reading Ability, Junior
Diagnostic Examination of Silent Reading Ability, Senior 1-1 (^) 
Diagnostic Reading Tests: Diagnostic Battery, Vocabulary 1-19(4,5,6)21
Diagnostic Reading Tests: Diagnostic Battery, Comprehension 1-19(4,5,6)21
Diagnostic Reading Tests: Diagnostic Battery, Rates of Reading 1-19(4,5,6)21
Diagnostic Reading Tests: Diagnostic Battery, Word Attention 1-19(4,5,6)21

Diagnostic Reading Tests: Survey Test 1-19(4,5,6)21
Durrell Analysis of Reading Difficulty 1-19(2,3)12

Durrell-Sullivan Reading Achievement, Intermediate 1-19(2,3)12
Durrell-Sullivan Reading Capacity and Achievement, Primary 1-19(2,3)12

Durrell-Sullivan Reading Capacity: Intermediate 1-19(2,3)12

Emporia Silent Reading Test 1-19(2,3,4)6 174 

Examiner's Reading Diagnostic Record for High School and College 
Students 1-19(5,6)11 
Garvey Primary Reading Test 1-19(2)1 
Gates Advanced Primary Reading Tests 1-19(2)11 181

Gates Basic Reading Test, Form I 1-19(2,3,4)11
Gates Primary Reading Tests, Form I 1-19(2)11
Gates Reading Survey, Form I 1-19(2,3,4,5)

Gates Reading Survey for Grades 3-10, Form II 1-19(2,3,4,5) 

Haggerty Reading Examination, Sigma 1 1-19(2)12 

Haggerty Reading Examination, Sigma 3 1-19(3,4,5)12 

High School Reading Test 1-19(4,5)13 

Ingraham-Clark Diagnostic Reading Tests, Intermediate 1-19(3)1 
Ingraham-Clark Diagnostic Reading Tests, Primary 1-19(2)1
Interpretation of Data 1-19(4,5)2 
Iowa Every-Pupil Tests of Basic Skills 
Silent Reading Comprehension, Advanced A 1-19(3,4,5) 184 

Silent Reading Comprehension, Elementary, A 1-19(2,3)5 
Work-Study Skills, Advanced, Form O 1-19(3,4)5 
Work-Study Skills, Elementary, Form O 1-19(2,3)5 
Iowa Silent Reading Tests, Advanced, Form Am 1-19(4,5,6)6 170 

Iowa Silent Reading Tests, Elementary, Form Am 1-19(3,4)6 170, 184 

Kansas Primary Reading Test 1-19(2)16 176 

Kansas Vocabulary Test 1-19(3,4)16 176 

Lee-Clark Reading Test, First Reader, Form A 1-19(2)1 
Lee-Clark Reading Test, Primer, Form A 1-19(2)1 
Los Angeles Elementary Reading Test, Form I 1-19(2,3,4)1 
Los Angeles Primary Reading Test 1-19(2)1 
Los Angeles Primary Word Recognition Test 1-19(2)1 
Manwiller Word Recognition Test 1-19(2)1 
Metropolitan Achievement Tests 
Advanced Reading 1-19(4)12 
Elementary Reading 1-19(2,3)12 
Intermediate Reading 1-19(3)12 
Michigan Speed of Reading Test 1-19(3,4,5,6)7 361 

Michigan Vocabulary Profile Test 1-19(5,6)12 172 

Minnesota Reading Examination for College Students 1-19(4,5,6)17

Nelson Silent Reading Test, Form A 1-19(2,3,4)5 
Nelson Silent Reading Test, Form B 1-19(2,3,4)5 
Nelson Silent Reading Test, Form C 1-19(2,3,4)5

Nelson-Denny Reading Test, Form A 1-19(4,5,6)5 
Nelson-Denny Reading Test, Form B 1-19(5,6)5 
Ohio Techniques in Reading Comprehension 1-19(4,5)18 
Poley Precis Test 1-19(4,5)8

Primary Reading Test, Form A 1-19(2)5 
Primary Reading Test, Form B 1-19(2)5 
Progressive Reading Tests 178 
Advanced, Form A 1-19(4,5,6)1 
Advanced, Form Am 1-19(4,5,6)1

Elementary, Form A 1-19(5)1
Elementary, Form Am 1-19(5)1
Intermediate, Form A 1-19(4)1
Intermediate, Form Am 1-19(4)1
Primary, Form A 1-19(2)1 

Public School Achievement Tests: Reading 1-19(2,3,4)8
Sangren-Woody Reading Test 1-19(5,4)12 182 

Schrammel-Gray High School and College Reading Test 1-19(4,5,6)8 
Schrammel-Wharton Vocabulary Test 1-19(4,5,6)16 
Sentence Vocabulary Test 1-19(2,3,4)

SRA Reading Record — Buswell 1-19(2,3,4)1 

Stanford Achievement Test 
Advanced Reading, Form D 1-19(4)12 
Intermediate Reading, Form D 1-19(3)12 
Primary Reading, Form D 1-19(2)12 
Study-Habits Inventory 1-19(5,6)10 
Thorndike-Lorge Reading Test, Form A 1-19(4)11 
Thorndike-Lorge Reading Test, Form B 1-19(4)11
Traxler High School Reading Test 1-19(5)8 
Traxler Silent Reading Test 1-19(4,5)8 157 

Tyler-Kimber Study Skills 1-19(5,6)10 
Unit Scales of Attainment: Reading Comprehension
Form A, Division 1 1-19(3)3 

Form A, Division 2 1-19(5)3 

Form A, Division 3, Grade 3 1-19(2)3 

Form A, Division 3, Grades 7-^ 1-19(4)3 

Form A, Division 4 1-19(4,5)3 

Form B, Division 4 1-19(4,5)3 

Form C, Division 1 1-19(3)3 

Van Wagenen Unit Scales of Attainment, Reading 1-19(4)3 161, 163 

Watson-Glaser Tests of Critical Thinking 

Battery I: Discrimination in Reasoning 1-19(4,5,6)12
Battery II: Logical Reasoning 1-19(4,5,6)12
Whipple's High School and College Reading Test 1-19(4,5,6)8 
Wide Range Vocabulary Test 1-19(2,3,4,5)7 
Williams Primary Reading Test 1-19(2)8 

Social Studies (General) 

American History-Civics: National Achievement Tests 1-20(2,3)13
Cooperative Social Studies Test 1-20(4)2

Cooperative Test of Social Studies Ability 1-20(5)2 
Progressive Tests in Related Sciences, Elementary 1-20(3,1)1
Stanford Achievement Test: Social Studies 1-20(3,1)12
USAFI General Educational Development
Test 2: Interpretation of Reading Materials in Social Studies
College Level, Form B 1-20(6)2
High School Level, Form B 1-20(5)2

Social Studies (Civics)

Brown-Woody Civics Test 1-20(1,5)12

Burton Civics Test 1-20(3,1)12
Cooperative American Government Test 1-20(5)2

Cooperative Community Affairs Test 1-20(5)2
Mordy-Schrammel Civics Test 1-20(4)16
Mordy-Schrammel Constitution Test 1-20(5,6)16
USAFI Civics 1-20(5)2
USAFI Problems of Democracy 1-20(5)2

Social Studies (Economics)

Cooperative Economics Test 1-20(5,6)2

Social Studies (History)

American History Tests: National Achievement Tests 1-20(1)13

Cooperative American History Test 1-20(5,6)2 197

Cooperative Ancient History Test 1-20(5)2

Junior American History Test 1-20(4)12

Kansas American History Tests I and II 1-20(5)16

Kansas Modern European History Tests I and II 1-20(5)16

Kniss World History Test 1-20(5)12

Public School Achievement Tests: History 1-20(3,4)8

Reading Scales in History, Forms A and B 1-20(4,5)3

Taylor-Schrammel World History Test 1-20(5,6)16

USAFI American History 1-20(5,6)2

USAFI Modern European 1-20(5,6)2

USAFI World History 1-20(5)2

Spanish 

Columbia Research Bureau Spanish Test 1-21(5,6)12 
Cooperative Spanish Test, Advanced 1-21(5,6)2
Cooperative Spanish Test, Elementary 1-21(5,6)2
Cooperative Spanish Test, Higher Level 1-21(5,6)2
Cooperative Spanish Test, Lower Level 1-21(5,6)2
USAFI Spanish Grammar 1-21(5)2
USAFI Spanish Reading Comprehension 1-21(5)2
USAFI Spanish Vocabulary 1-21(5)2

Spelling

Buckingham Spelling Scale 1-22(3,4,5)8 179

Davis-Schrammel Spelling Test 1-22(^3,4)16
Kansas Spelling Test, Tests I, II, III 1-22(2,3,4)16 176 

Wellesley Spelling Scale 1-22(5,6)1 176 

Teaching Skills 

Carrigan Score Card for Rating Teaching and the Teacher 1-23(6)12
How I Teach, Purdue Teacher’s Examination 1-23(6)3 

Vocational Skills 

Industrial Training Classification Test, A 1-24(5,6)9

Industrial Training Classification Test, B 1-24(5,6)9 

Mechanical Drawing Test 1-24(5,6)16 

Mechanical Drawing Test (Indiana Tests) 1-24(4,5)19 

Purdue Blueprint Reading Test, Owen, Arnold 1-24(5,6)9

Purdue Interview Aids, Lawshe 1-24(6)9 

Purdue Test for Electricians, Caldwell 1-24(5,6)9 

Purdue Test for Machinists and Machine Operators 1-24(4,5,6)9 

Test for Ability to Sell 1-24(4,5,6) 

Test of Practical Judgment, Cardall 1-24(6)9 

Miscellaneous

Animal Husbandry Test (Indiana Tests) 1-24(4,5)19 

Farm Shop Tools (Indiana Tests) 1-24(4,5)19 

Kefauver-Hand Recreational Guidance Test 1-24(4,5,6)12 

Tests of Human Growth and Development, Horrocks & Troyer 1-25(6)

USAFI Examination in Elementary Psychology 1-25(6)2 

II. APTITUDE
Applied Science 

Engineering and Physical Science Aptitude Test 11-1(5,6)7 269 

Physical Science Aptitude Exam 11-1(5,6)6 

Stanford Scientific Aptitude Test 11-1(4,5)10 

Batteries of Tests {General) 

Chicago Tests of Primary Mental Abilities 11-4(4,5,6)9 213, 232, 238

Crow Tests for High School Entrants 11-4(3,4)13 
Detroit General Aptitudes Examination 11-4(3,4,5)8 

Differential Aptitude Tests 11-4(4,5)7 241, 238 

General Aptitude Test Battery, U.S. Employment Service, Washington, D.C. 11-5(4,5,6) 243, 238

Graduate Record Examinations 11-4(6)3 203^208 

Guilford-Zimmerman Aptitude Survey 11-4(4,5,6)20 238, 238 

Yale Educational Aptitude Tests 11-4(4,5,6) 243, 231

Business

Clerical Aptitude Tests 11-5(4,5,6)13
Detroit Clerical Aptitudes Examination 11-5(5)8
ERC Stenographic Aptitude Test 11-5(5,6)9
Minnesota Clerical Test 11-5(1,5,6)7 23, 376

Psychological Corporation General Clerical Test 11-5(4,5)7 158, 203

SRA Clerical Aptitude 11-5(5,6)9 203

SRA Dictation Skills 11-5(1,5)9 203

SRA Language Skills, Stenographers 11-5(4,5)9 203

Stenographic Aptitude Test 11-5(1,5,6)7 203

Turse Shorthand Aptitude Test 11-5(1,5)12

Latin

Orleans-Solomon Latin Prognosis Test 11-14(5,6)12

Mathematics

California Algebra Aptitude Test, Keys 11-16(4,5)3
Iowa Algebra Aptitude Test 11-16(4,5)6
Iowa Plane Geometry Aptitude Test 11-16(4,5)6
Lee Test of Algebraic Ability 11-16(4)8
Lee Test of Geometric Aptitude 11-16(5,6)1

Orleans Algebra Prognosis Test 11-16(1,5)12

Orleans Geometry Prognosis Test 11-16(5,6)12

Reading Readiness 

Betts Ready-to-Read Tests, Keystone View Co., Meadville, Pa. 11-19(1,2) 102

Gates Reading Readiness Tests 11-19(1,2)11

Lee-Clark Reading Readiness Test 11-19(1,2)1

Metropolitan Readiness Tests 11-19(1,2)12 99

Monroe Reading Aptitude Test 11-19(2)5 100

Murphy-Durrell Diagnostic Reading Readiness Test 11-19(1,2)12

Stevens Reading Readiness Test 11-19(1,2)12 

Van Wagenen Reading Readiness Test 11-19(1)3 

Teaching Skills 

Coxe-Orleans Prognosis Test of Teaching Ability 11-23(5,6)12
Stanford Educational Aptitude Test 11-23(6)10

Vocational Skills 

Acorn Mechanical Aptitude Tests 11-24(4,5,6)13

Bennett Hand-Tool Dexterity Test 11-24(4,5,6)7

Detroit Mechanical Aptitude Examination 11-24(4,5,6)8 30

Detroit Retail Selling Inventory 11-24(5,6)8

Dynamometer, Stoelting 11-24(4,5,6) 21

Eames Eye Test 11-24(3,4,5,6)12 298

Farnsworth Dichotomous Test for Color Blindness; Panel D-15 
11-24(2,3,4,5,6)7 

Farnsworth-Munsell 100-Hue Test for Anomalous Color Vision 
11-24(2,3,4,5,6)7 298 

Harris Tests of Lateral Dominance 11-24(2,3,4,5,6)7
Keystone Telebinocular, Keystone View Co 11-24(2,3,4,5,6) 298 

MacQuarrie Test for Mechanical Ability 11-24(5,6)1 
Mechanical Aptitude Tests 11-24(4,5,6)13 
Mellenbruch Curve-Block Series 11-24(4,5,6)

Mellenbruch Mechanical Aptitude Test 11-24(4,5,6)9 
Minnesota Assembly Test, Stoelting Co. 11-24(4,5,6) 29, 376

Minnesota Rate of Manipulation 11-24(4,5,6)3 21, 282, 284, 292 

Minnesota Spatial Relations Test 11-24(4,5,6)3 292, 376 

Ophthalmograph, American Optical Co. 11-24(2,3,4,5,6) 

Orthorater, Bausch and Lomb Optical Co. 11-24(2,3,4,5,6) 296 

Oseretsky Test of Motor Proficiency 11-24(1,2,3,4,5)3 
Pennsylvania Bi-Manual Worksample, Roberts 11-24(4,5,6)3 
Personnel Selection and Classification Test 11-24(6)1 
Prognostic Test of Mechanical Abilities 11-24(4,5,6)1 270 

Purdue Hand Precision Test 11-24(4,5,6)19 
Purdue Industrial Training Classification Test 11-24(4,5,6)9 
Purdue Mechanical Adaptability Test 11-24(4,5,6)19 285 

Purdue Pegboard 11-24(4,5,6)9 278, 292 

Revised Minnesota Paper Formboard Test, AA and BB 11-24(3,4,5,6)7 

SRA Mechanical Aptitudes 11-24(5,6)9 267 

Stenquist Mechanical Aptitude Test 11-24(3,4,5)12 287 

Survey of Mechanical Insight, Miller 11-24(4,5,6) 1 

Survey of Object Visualization, Miller 11-24(4,5,6)1 

Survey of Space Relations Ability, Case-Ruch 11-24(4,5,6)1 

Survey of Working Speed and Accuracy 11-24(6)1 

Test of Mechanical Comprehension 11-24(4,5,6)7 

Tracing Apparatus, Stoelting 11-24(4,5,6) 21

U S Employment Svce., Finger and Manual Dexterity 11-24(4,5,6) 


Miscellaneous 

Cancellation Test 11-25(6)3 

Foreign Language Prognosis Test, Forms A and B 11-25(4)11 

How Supervise? Forms A and B 11-25(6)7 

Iowa Legal Aptitude Test 11-25(5,6)6 

Iowa Placement Examinations, Form A 11-25(5,6)6 

Iowa Placement Examinations, Form M 11-25(5,6)6 

Law School Admission Test 11-25(6)2 

Luria-Orleans Modern Language Prognosis Test 11-25(5,6)12 
Medical College Admission Examination 11-25(6)2 207

Wilson Tests of Religious Aptitude, I, II, III, IV, V, and VI 11-25(4,5,6)16

III. INTELLIGENCE
1. Group

Academic Aptitude Tests: Non-Verbal Intelligence 111-1(4,5,6)13
Academic Aptitude Tests: Verbal Intelligence 111-1(4,5,6)13
Adaptability Test, Form A 111-1(3,6)9
Advanced Perception of Relations Scales 111-1(6)3
American Council on Education Psychological Examinations
College Freshmen 111-1(6)2 226, 210

High School Students 111-1(5)2
Army General Classification Test (civilian) 111-1(4,5,6)9 329

California Capacity Questionnaire 111-1(4,5,6)1

California Intelligence Test Language Test 
Advanced Series 111-1(4,5,6)1

Advanced Series, machine scoring 111-1(4,5,6)

Elementary Series 111-1(3,4)1
Intermediate Series 111-1(4,5)1
Intermediate Series, machine scoring 111-1(4,5,6)1
Pre-Primary Series 111-1(1,2)1
California Intelligence Test Non-Language
Advanced Series 111-1(5,6)1

Advanced Series, machine scoring 111-1(4,5,6)1
Elementary Series 111-1(3,1)1
Intermediate Series 111-1(4,5,6)1
Intermediate Series, machine scoring 111-1(4,5,6)1
Pre-Primary Series 111-1(1,2)1
Primary Series 111-1(2)1
California Mental Maturity 2/7, 248
Advanced 111-1(4,5,6)1
Elementary 111-1(3,4)1
Intermediate 111-1(4,5)1
Pre-Primary 111-1(1,2)1
Primary 111-1(2)1

Carnegie Mental Ability Test, Form A 111-1(5,6)5
Chicago Primary Mental Abilities 111-1(1,2)16 233, 252, 253, 258, 411-416

Cole-Vincent Primary Group Intelligence Test 111-1(1,2)16
Culture-free Intelligence Tests 111-1(2,3,4,5,6) 228

Culture-free Test, Cattell 111-1(4,5,6)7
Dearborn Group Intelligence, Series I 111-1(1,2,3)1

Dearborn Group Intelligence, Series II 111-1(3,4,5)3

Detroit Advanced Intelligence Test 111-1(4,5,6)8
Detroit Alpha Intelligence Test 111-1(3,4)8
Detroit Primary Intelligence Test 111-1(2,3)8
Haggerty Intelligence Examination, Delta 1 111-1(2)12 226

Haggerty Intelligence Examination, Delta 2 111-1(2,3,4)12

Henmon-Nelson Mental Ability, College, A 111-1(6)5
Henmon-Nelson Mental Ability, College, B 111-1(6)5
Henmon-Nelson Mental Ability, Elementary 111-1(2,3,4)5
Henmon-Nelson Mental Ability, High School 111-1(4,5)5
Inductive Reasoning Test 111-1(5,6)3

Junior Scholastic Aptitude Test, Educational Records Bureau 111-1(4)
Kent Series of Emergency Scales 111-1(1,2,3,4)7
Kuhlmann-Anderson Tests, Grade 1-1 111-1(2)3 323 

Kuhlmann-Anderson Tests, Grade 1-2 111-1(2)5 

Kuhlmann-Anderson Tests, Grade 2 111-1(2)3 

Kuhlmann-Anderson Tests, Grade 3 111-1(2)3 

Kuhlmann-Anderson Tests, Grade 4 111-1(3)3 

Kuhlmann-Anderson Tests, Grade 5 111-1(3)3 

Kuhlmann-Anderson Tests, Grade 6 111-1(3)3 

Kuhlmann-Anderson Tests, Grades 7-8 111-1(4)3 

Kuhlmann-Anderson Tests, Grades 9 and up 111-1(4,5,6)3 
Langmuir Oral Directions Test 111-1(4,5,6)7 226 

Miller Mental Ability Test, Form A 111-1(4,5,6)12 
Modified Alpha Examination, Form 9 111-1(4,5,6)7 

Multi-Mental Scale 111-1(2,3,4,5,6)11 
National Intelligence Tests, Scale A, Form 1 111-1(2,3,4)12 

New California Short-Form Test of Mental Maturity 
Advanced 111-1(4,5,6)1 
Advanced, machine scoring 111-1(4,5,6)1
Elementary 111-1(3,4)1 
Intermediate 111-1(4,5,6)1 
Intermediate, machine scoring 111-1(4,5,6)1 
Pre-Primary 111-1(4,5,6)1

Primary 111-1(2)1 

Non-Language Multi-Mental Test, Form A 111-1(3,4,5,6)11 
Non-Language Multi-Mental Test, Form B 111-1(3,4,5,6)11 
O’Rourke General Classification Test 111-1(5,6) 227 

Ohio State University Psychological Test 111-1(5,6)18 
Otis Classification Test (Revised) 111-1(3,4)12 226 

Otis Classification Test for Industrial and Office Personnel, Western 
Reserve University Press 111-1(4,5,6)

Ability, general, 225; and intelligence,
255; comparison of batteries, 256 
Abstraction (see also Perception, Ap- 
perception), 146 

Academic (see also Scholastic), predictions
from tests, 249
Accomplishment quotient (AQ), 214 
Achievement test, 37; correlational
analysis, 214
Adaptive behavior, 86 
Age level, 117, 127, 130; for instruction,
216 

Age norms, 126 
Age score, 118 
Agencies using tests, 9 
Aggressiveness, 642
Agility tests, 273 
Alcohol, 510 
Algebra tests, 189 

Altitude test (same as Power test), 39 
Ambiguity, 69 

Analogies test, 225; spatial, 414; verbal, 
118 

Apperception, thematic, 503; three
dimensional, 545 

Appraisal, of a test, 56; varieties of, 18 
Aptitude test, 38, 238, 244^ 246; results, 
247 

Arithmetic tests, 178, 187; diagnostic, 
189 

Art, and intelligence, 322; analytical
studies, 323; appreciation, 307; composition,
310, 317; drawing ability,
317; nature of, 36; preferences for
pictures, 307; responses to color, 313;
responses to lines, 312
Ascendance, 365 f 

Association test, words, 530; pictures, 
503 

Attenuation, 396 

Attitude, comparison of techniques of
appraisal, 623; definitions, 594; factorial
analysis, 613; group norms, 625;
modification of, 616; needed research,
626; reasons for, 606; relation to information,
620; toward church, 321;
typical scales, 486, 607; public opinion,
596; social distance, 603; study of
values, 607
Audiometer, 299 
Auditory, 298 
Auditory span, 22 
Average, 363 
Axes, 409; rotation, 410

Bar chart, 380 

Batteries, achievement, 156, 158; aptitude,
233; mechanical, 269; military,
327; motor, 281
Behavior rating, 708 
Bias, 597 
Bi-modal, 366 
Binet-type test, 112
Biographies, 552, 693 
Block counting, 225, 330; design, 145
Borderline, 133
Breadth test, 40
Brown-Spearman formula, 50 
Business achievement, 201; informa- 
tion, 205 

Centile (same as Percentile), 47, 361
Central tendency, 363; average, 361;
mean, 364; mode, 365; median, 366
Chance, correction for, 70; results, 71 
Child Development Abstracts, 8 
Cognitive process, 21 
Childhood, early, 71 
Class interval (i) (same as Step), 356 
Clerical aptitude tests, 23, 25, 183 
Clerical test, 201; typing, 202; short- 
hand, 202 

Clinical, uses of tests, 10, 150, 151, 415; 
physique, 435 

Coefficient of alienation (k), 399
Combining scores, 376 
Completion test, 61, 532 
Concept, formation, 146, 482 
Conduct, 698; lying, stealing, 699; persistence,
701; reliability, 727
Confidence interval, 372 
Construction, of questionnaire items,
464; of test items, 66; of tests, 59
Correlation coefficient, analysis of, 395;
applications, 394; calculation, 382;
errors, 77; interpretations, 385, 399;
probable errors, 398; product-moment,
384; rank order, 387; biserial,
389; tetrachoric, 388
Correlation matrix, 403 
Correlation table, 382, 385 
Criteria (see also Validity), Binet, 123;
interests, 591; industrial, 291; academic,
149
Critical ratio, 393 
Critical score, 366 
Cube Imitation Test, 142, 144 
Culture-free testing, 228 
Cumulative frequency, 360 
Curve, cumulative, 360; growth, 109,
126; normal frequency, 359; ogive,
360; regression, 382; shape, 366

Deafness, 299 
Deciles, 361

Development, infants, 103; diagnosis,
79

Developmental age, 84; schedules, 86
Deviations, 365
Dexterity tests, 276
Difficulty of an item, 43
Dispersion (same as Scatter and Variability),
of scores, 367; uses of measures
of, 372

Distribution of scores, 356 
Drawing ability, composition and representation,
development of, 478; as
a measure of intelligence, 90, 479; as
indications of personality, 473

Educational Age (EA), 208
Educational Quotient (EQ), 213
Educational tests, uses of, 10; objectives,
156; types, 156; results, 208
Educational Testing Service, 9 
Emotion, 638, 6&l 
Essay test, 164 
Ethical standards, 11 
Examiner, requirements of, 13; training
of, 15

Expression, verbal, 160 
Fables, 521 

Facial expression, 721
Factor, chance factors, 408; common
factors, 408; general factor, 404; group
factor, 404; specific factors, 405
Factorial analysis, advantages, 51, 400,
421; limitations, 420; of abilities, 411;
of traits, 411; results, Thurstone's
method, 405

Fantasy (same as Story), 502 
Feeble-minded, 133 
Fiduciary limits, 370 
Forced choice ratings, 458, 589 
Fore exercise, 66, 220 
Foreign languages, 200 
Frequency table, 356; polygon, 358;
curve, 368; cumulative, 360; normal, 
359 

Frustration, 515 

Gaussian Curve (same as Normal probability
curve), 359
Geometry tests, 190 
Gestalt, drawing, 474; motor, 480 
Grade equivalent, 208 
Graduate examinations, 206 
Grammar tests, 174, 175 
Graphology, 489
Group testing, 41, 225; projection
sketches, 525

Growth, early childhood, 107; attitudes,
605 

Guessing, chance success, 70; told not to
guess, 73 

Halo effect, 461

Handwriting, personality and, 489-494;
scales, 160
Hearing, 298 
Histogram, SIS 
History test, 168

Idiot, 133 
Imbecile, 133 

Individual predictions, 390 
Individual test, 40, use of, 291 
Infancy, 79
Inflection point, 359 
Insight, 518 

Instruction, optimum age for, 216 
Intangibility of items, 459 
Integration, 32, 485, 703 
Intelligence quotient (IQ), constancy of,
135; for adults, 134; interpretation of,
288; similarity, of average, 132; of
standard deviations, 132; of gifted
children, 137

Intelligence test, group, 222; individual, 
112 

Interest (same as Desire, Wish, Like),
case histories, 552; definitions, 550;
factor analysis, 579, 580; inventories,
vocational, 561; job satisfaction, 449;
needed research, 582; permanence,
prediction of scholastic success, 576;
prediction of vocational success, 577;
reasons for academic choices, 554;
reasons for vocational choices, 577
Internal consistency, 63; homogeneity,
62

Interpolation, 361 

Interviews, 36, 711; group, 711; therapeutic,
717; employment, 711, 719;
stress, 720, 721; tests in, ^6
Introversion, 636 

Inventories, personality, 447, 627; interest,
550; attitude, 595
Item, analysis, 53; construction, 59;
merits of various types, 61, 62, 63, 64;
types of, 61, 261; number of items, 590


Job satisfaction, 449 

Knowledge tests, 30; requirements, 65;
word knowledge tests, 165; history,
168; vocabulary, 166
Kurtosis (Ku), 369 

Language, defined, 157; measures of,
158; foreign, 200; order of development,
86, 158; techniques of reading,
180

Laterality, 102
Learning, 22 
Letter grades, 361, 363 
Lie detecting, 726 
Limits of a score, 255 
Limitations of testing, 16; of factorial
analysis, 394

Literature, appreciation of sounds, 499;
discrimination of style, 495; information,
498; merit of prose and poetry,
601

Logs, 35, 558 
Logical errors, 461 

Mental deficiency, 133 
Man-to-man rating scale, 457 
Matching test, 61 

Mathematics, 187; arithmetic tests, 187;
algebra, geometry, and trigonometry
tests, 189; vocabulary, 172
Maturity, age of, 130 
Maze test, 145 

Mean, 364; standard error of, 370
Mechanical tests, Minnesota assembly,
29; paper and pencil, 30, 265; correlations,
286; results in industry,
291

Median (Md), 366 
Medical aptitude, 209 
Mental age, changes in meaning, 130;
interpretation, 129; of national
groups, 150; scaling, 127
Mental hygiene, 199
Military, 327; US Army, 223, 225, 328;
AGCT, 330; civilian occupation, 533;
individual test, 335; AAF, Army Air
Force, 337; stanines, 339; Navy, 340;
batteries compared, 343; personality,
344; results, 348
Mode, 365 

Moral knowledge, 599
Moron, 133
Mosaic tests, 482 
Motivation, 31, 548 

Motor ability, 20; co-ordination, 281;
and spatial thinking, 281; infants, 86;
correlational analysis, 285; rhythms,
280; tests, 261; gestalt, 480
Multiple choice test, 61
Music, 300; appreciation, 304; correlational
analysis, 305; criticism of tests,
306; information, 301; racial differences,
307

Needs, 506 

Neurotic tendencies, 509 
Nominating technique, 459 
Non-verbal tests, infants, 79, 89; adults,
138, 228
Norm, 47, 48 

Normal probability curve (same as
Gaussian Curve), 359
Null hypothesis, 372 

Objectivity, 44 

Observational method, 705, 723 
Occupations, 150, 333 
Ogive, 360 
Omnibus test, 226
Opinion, 695 
Ordinate, 359, 371 
Orthogonal, 411 

Painting, 473 

Paired Comparisons Method, 448 
Percentile, 361 

Perception, 21; of mechanical detail,
265; tests of reading, 182
Performance scales, 138 
Personality (same as Self), 17; theory,
425; inventories, 627, 664; psychiatric,
630; psychological, 634; and physique,
641; factorial analysis, 644, 658;
humor, 648; mental health, 650; sociological,
654
Phi coefficient, 55 

Physical, 19; infants, 85; adults, 430
Physical sciences, 194 
Physiological measures, 19 
Physique, 430; somatotypes, 431, 433;
morphological measures, 435
Picture interpretation, 503; preference,
307, 529; make-a-picture-story, 542
Playing, 537; role playing, 539; word
test, 540

Power test (same as Altitude test), 39
Prediction, of probable scores, 48;
scholastic, 219; vocational, 218; from
preschool tests, 110; from Binet tests,
119; from group tests, 255
Press, 507 

Primary mental abilities, 93, 233; defined,
257; measured, 411
Probability, 370
Probable error (PE), 368
Profile chart, 375, 376, 377
Projective techniques, 32; infants, 87;
ink blots, 668; pictures, 503; words,
518

Propaganda, 616 
Proverbs, 632

Psychiatric classification, 435
Psychoanalytic theory, 439, 522, 529
Psychological Abstracts, 7
Psychological tests, 39; theories of personality,
414

Psychophysical measures, 20
Puzzle-type tests, 283

Quartile, 361; deviation (Q), 368
Questionnaire, 561

Random errors, 396 
Range, 367 

Rank, 361; order comparison, 451; order
correlation, 387
Rate tests, 39

Rating, scale methods, 462; rules, 464,
468; validity, 462, 469; inventories,
448; paired comparison, 449; rank
order, 451; forced choice, 458; graphic
scales, 455; man to man, 457; nominating,
459; number of steps, 460; intangibility,
459; types, 448; errors,
461; wording of items, 465; anonymous,
466; combination of, 468
Raw score, 357, 366 
Reaction time, 270 

Reading, 181; difficulty, 186; organization,
181, 188
Reading readiness, 99-103 
Reasoning, 25 
Regression equation, 382 
Reliability, 50; coefficient, 100; of larger
samples, 101
Rho (ρ), 387
Rorschach tests, 668; and mental age,
norms, reliability, 685; scoring, 676;
typical reports, 686; administration,
669; group Rorschach, 673; interpretation,
682; occupational, 688; general
Rorschach score, 678; clinical, 689

Sample, 52 

Scale, 5; age, 126; types of, 448; number
of items, 460; zero, 352; equal intervals,
452

Scatter diagram (see also Correlation 
table), 379, 380 

Scholastic predictions, 110, 149, 215 
Score, dispersion of scores, 367; interpretation,
355; limits, 255; range of
scores, 367; raw, 367; standard, 365;
standard deviation, 365; weighted,
376

Scoring, 44, 45, 46
Semi-interquartile range (Q), 368
Sentence, structure, 174; completion in 
personality testing, 532 
Skewness, 369 

Social adjustment (see also Mental
hygiene), 198, 697
Social sciences, 194; studies, 200
Sociometry, 697
Spatial visualization, 266
Spearman-Brown Prophecy Formula, 60
Speed tests, 39
Spelling, 176, 179
Split-group method, 53 
Standard Deviation (SD, or σ), calculation,
369; of a score, 392; scaling, 371
Standard error of estimate, 390 
Standard error of an obtained score, 392; 
of a mean, 370; of a sigma, 371; of a 
difference, 392 
Standard score, 374 
Standardized test, 6, 37 
Stanines, 339 
Steadiness, 278 
Steps, equivalence of, 452
Story, 602 


Strength, 273 

Substitution test (see also Association
test), 24, 143

T score, 371, 374 
Tabulation, 357 

Test Items, analysis, 61; chance fac 
70, construction, 61; completion, 
types, 60, matching, 69; mull 
choice, 68 

Tests, achievement, 37, 155; types, 
formal, 40, agencies using, 9; ager 
providing, 8; construction, 6, 
ethical standards, 11; sources of 
formation about, 7 
Thematic apperception, 503 
Time allowance, 64 
Time-sampling, 35, pre-school child
691 

Trait, 17

Trigonometry, 189 
True-false test, 61 
Two-factor test, 230 

Uniqueness, 51 

U S Employment Service, 8, 243, 26i 

Validity, 48, coefficient, 49; face, 49 
Variability (same as Dispersion) (see 
Range, Quartile Deviation, Stanc 
Deviation), 367 
Values, study of, 607 
Variance, 407 
Vision, 295 

Vocabulary, profile, 172 
Vocation, choice, 55 1; interest im 
tones, 561, predictions from in' 
tories, 577, predictions fiom tests, 
253, satisfactions, 449; factorial a 
yses of interests, 579 

W-factor, 662 
Weighted scores, 376
Work-limit, 39 

Zero, 452 



